David Hopwood was born in 1933 in Kinver, England. He received his
bachelor's and Ph.D. degrees from the University of Cambridge, where he
was an assistant lecturer in Botany for five years before moving to the
Department of Genetics in the University of Glasgow in 1961. Since
1968 he has been John Innes Professor of Genetics in the University of
East Anglia, Norwich, and head of the Genetics Department at the John
Innes Centre. From the beginning of his research career he has pioneered
studies of the genetics of Streptomyces, a member of the group of
filamentous, Gram-positive soil bacteria called the actinomycetes that are
preeminent producers of polyketide and other antibiotics. Over the last
10 years or so he has harnessed this genetic system to help to illuminate
the mechanisms of polyketide biosynthesis.

I. Introduction 

Over the last eight years or so, genetic techniques
have spearheaded significant advances in our understanding
of the structure and mechanisms of polyketide
synthases (PKSs). Much insight into these
fascinating multifunctional enzymes had already
been obtained by chemical and biochemical approaches,
including the establishment of a mechanistic
relationship between polyketide and fatty acid
biosynthesis, in which the carbon backbones of the
molecules are assembled by the successive condensation
of small acyl units. 1 Severe difficulties had,
however, been met in trying to understand a key
issue in the field, namely how the enzymes are
"programmed". This term has been introduced to
describe control of the variables that determine the
structure of the product of a specific PKS (Figure 1).
These variables are choice of starter unit; choice of
the nature and number of the chain extender units;
control of the reductive cycle on the β-keto group of
the growing carbon chain, which in turn determines
the keto, hydroxyl, enoyl, or methylene functionality
at each alternate carbon atom; stereochemistry of
hydroxyl and alkyl side groups; and pattern of
cyclization of the nascent carbon chain. The programming
problem is the aspect of polyketide biosynthesis
on which genetic studies have had the
largest impact. First, the mere cloning and sequencing
of the structural genes for a variety of PKSs
established the number and primary structures of the
protein subunits that make up any particular synthase
(Figure 2). Examples were found of PKSs
resembling each of the classical classes of fatty acid
synthases (FASs): type I FAS, characteristic of fungi
and vertebrates, in which the catalytic sites for the
various steps in the biosynthesis, acyl transferase
(AT), ketosynthase (KS), acyl carrier protein (ACP),
ketoreductase (KR), dehydratase (DH), and enoylreductase
(ER) (Figure 1), are carried as domains along
the length of multifunctional proteins; and type II
FAS, characteristic of bacteria and plants, in which
each catalytic site is carried on a separate protein
subunit. 3 Not surprisingly (at least in retrospect) the
first example of a fungal PKS had a type I organization,
while the first bacterial PKSs to be studied (for
members of the aromatic family of polyketides from
the actinomycetes) turned out to have a type II
structure. In contrast, and totally unexpectedly, the
gene sequences for PKSs for the macrolide polyketides
of actinomycetes revealed not only a type I organization,
hitherto known only in eukaryotes, but the
presence of multiple sets, or modules, of active sites.
Each module resembled a vertebrate type I FAS, and
the whole PKS consisted of a number of modules
equal to the number of rounds of condensation
required to build the polyketide product. Just as the
structure of the double helix had immediately suggested
a basis for the faithful replication of DNA, 14
so too the primary structure of the macrolide PKS
provided a compelling hypothesis for the programming
of the enzyme. On this hypothesis, the program
was hard-wired in the gene sequence and was expressed
in the encoded proteins as a series of active
sites appropriately arranged in relation to each other.
The polyketide would be built on an assembly line
represented by these active sites, with a loading
module at the start; followed by the appropriate
number of chain-extending modules in correct sequence,
each carrying the relevant complement
of reductive sites; and an end domain represented
by a thioesterase. The thioesterase would hydrolyze
the bond between the completed polyketide and the
4'-phosphopantetheine prosthetic group on the acyl
carrier domain of the last chain-extending module,
just as in a vertebrate FAS. This hypothesis had to
be experimentally verified, but it arose from the
sequence itself. In contrast, insight into the quite
different programming mechanism of the nonmodular
PKSs, in which there is only one catalytic site of
each type, and this has to act iteratively to build and
modify the polyketide chain, came not directly from
the sequences of the genes, but from the results of
their experimental manipulation.
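
The assembly-line hypothesis for the modular PKSs can be illustrated with a small sketch. The Python below is not taken from the studies reviewed here; it simply treats each chain-extending module as a set of optional reductive domains and reads off the functionality left at the β-carbon after each round of condensation. The names `Module`, `beta_carbon_state`, and `predict_chain`, and the three-module example, are hypothetical. The sketch assumes that the reductive domains can only act in the order KR, DH, ER, as in the reductive cycle described in the Introduction.

```python
from dataclasses import dataclass

# Assumed order of the reductive cycle on the beta-keto group:
# KR gives hydroxyl, DH gives enoyl, ER gives methylene.
REDUCTIVE_ORDER = ("KR", "DH", "ER")
OUTCOME = ("keto", "hydroxyl", "enoyl", "methylene")


@dataclass(frozen=True)
class Module:
    """One chain-extending module: optional reductive domains only.

    KS, AT and ACP are taken as present in every module.
    """
    reductive_domains: frozenset = frozenset()


def beta_carbon_state(module):
    """Return the functionality left at the beta-carbon by this module."""
    steps = 0
    for domain in REDUCTIVE_ORDER:
        if domain not in module.reductive_domains:
            break  # a later domain cannot act if an earlier one is missing
        steps += 1
    return OUTCOME[steps]


def predict_chain(modules):
    """One entry per module, i.e. per round of condensation."""
    return [beta_carbon_state(m) for m in modules]


# Hypothetical three-module synthase (not a real PKS).
example = [
    Module(frozenset({"KR"})),
    Module(),
    Module(frozenset({"KR", "DH", "ER"})),
]
print(predict_chain(example))  # ['hydroxyl', 'keto', 'methylene']
```

The length of the list equals the number of modules, which mirrors the observation that the number of modules equals the number of rounds of condensation.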

In this Review I have taken a somewhat historical
view of the understanding of the PKSs acquired via
genetic approaches, rather than attempting an exhaustive
review of developments in the field over a
specific recent time period. Most of the genetic work
with both the type II and the modular PKSs has
concerned the streptomycetes and their near relatives
among the actinomycetes, and this occupies the bulk
of the article in sections II and III, respectively. A
lesser amount of genetic work has involved the type
I PKSs of filamentous fungi, reviewed in section IV.
Outside of microbial systems, work on higher plants
has provided a wealth of knowledge about a group
of PKSs, the chalcone and stilbene synthases, that
probably represent a separate line of evolution from
the other PKSs and all of the known FASs, and this
is discussed in section V. Section VI is devoted to
some evolutionary speculations. The genes that
encode the "tailoring" steps that typically follow the
building and folding of the polyketide carbon chain
are not discussed; genetics has greatly illuminated
the nature and mechanisms of many of the enzymes
that catalyze these post-PKS steps also, but these
studies merge into the whole field of metabolic
pathway analysis and manipulation that is not
uniquely associated with the polyketides. Other
recent reviews of polyketide biosynthesis, some of
which take a more chemical stance than my article,
include refs 2 and 15-22.

Figure 2. Architecture of fatty acid synthases (FAS) and polyketide synthases (PKS) deduced from the gene sequences.
(Note the different scale for the type II synthases (top) and the type I and chalcone/stilbene synthases (bottom).) References
to PKS gene sequences for Streptomyces, Saccharopolyspora, fungi, and plants are in sections II, III, IV, and V, respectively,
where these genes are discussed in detail. For the FASs the references are: E. coli; 4 vertebrates; 5-7 fungi; 8-12
Brevibacterium. 13 The organization and possible evolutionary relationships between the various types of synthase are
discussed later, in section VI.

II. Aromatic Polyketide Synthases from
Streptomyces Species and Related 
Actinomycetes 

A. Cloning of the Genes 

Soon after methods for gene cloning in Streptomyces
species were published in 1980, 23-25 it became
possible to isolate genes for antibiotic biosynthesis
by a variety of procedures. 26,27 One of the first
approaches involved the shotgun cloning of random
fragments of DNA from a wild-type strain into a
mutant blocked at a step in the biosynthesis and
looking for a restoration of antibiotic production. A
second early approach depended on emerging evidence
for close linkage between genes for self-resistance
to an antibiotic and one or more of the
biosynthetic genes. From these early results a most
important generalization soon emerged: that in
streptomycetes, and by implication in other bacteria
too, all of the biosynthetic genes needed to make a
particular antibiotic from primary metabolites occur
together in a single cluster, and that one or more
genes for antibiotic self-resistance are also to be found
there. 28 A striking early demonstration of this was
the cloning of the entire cluster of genes (the act
genes) for biosynthesis of the pigmented benzoisochromanequinone
polyketide actinorhodin (1, Figure 3)
on a 35 kb fragment of chromosomal DNA from the
producer, Streptomyces coelicolor A3(2), and their

Table 1. Cloning and Sequencing of Aromatic PKS Gene Clusters from Actinomycetes a
a Other PKS gene sets that have been cloned but not (fully) sequenced include those encoding the PKSs for aclacinomycin, 59,60
kalafungin, 61 elloramycin, 62 PD117420, and tetrangulol. 63 b Key: 1, complementation of pathway-blocked mutants; 2, gene
disruption; 3, production of antibiotic after transfer of cloned genes to S. lividans; 4, production of relevant compounds in
recombinant strains. c These include references to the cloning and/or sequencing of the PKS genes.



of the pigment was (and still is) unknown, but the 
sequence of the complementing DNA left little doubt 
that it represented PKS genes for an aromatic 
polyketide (see below), and indeed the act and whiE 
PKS genes cross-hybridized, so that the actI probe
revealed two bands when hybridized to S. coelicolor
genomic DNA. 33 Probably reflecting a similar situ-
ation, use of the act probes to try to isolate the PKS
for the simple aromatic polyketide curamycin from
Streptomyces curacoi yielded DNA that may well 
encode the PKS for a polyketide spore pigment rather 
than the antibiotic, although this remains un- 
proven. 36 

In spite of these setbacks, the act probes have been 
instrumental in the isolation of sets of genes that 
encode the PKSs for a variety of aromatic metabolites 
(Figure 3), so that by now at least 18 have been 
cloned and sequenced (Table 1), including further
examples of benzoisochromanequinones, nanaomycin
(5), frenolicin (6), and griseusin (7); anthracyclines,
daunorubicin (8), the closely related doxorubicin
(adriamycin), and nogalamycin (9); two angucyclines,
jadomycin (10) and urdamycin (11); and the
aureolic acid derivative mithramycin (12). Some
others have been cloned but not yet sequenced (Table
1, footnote). Table 1 also lists the classes of evidence
adduced to prove involvement of each gene set in
biosynthesis of the target polyketide. The most
general of these are (1) complementation of bona fide 
blocked mutants by the cloned PKS DNA to restore 
production of the relevant polyketide; (2) disruption, 
using fragments of the cloned DNA, of the corre- 
sponding genes in the wild-type strain, to generate 
a nonproducing phenotype; and (3) transfer of a set 
of antibiotic biosynthetic genes, including the pre- 
sumptive PKS genes, to a heterologous host (usually 
S. lividans), resulting in production of the relevant 
antibiotic (or a precursor of it). 



It is now clear that, although all PKSs (except the
chalcone/stilbene family) may well share the same
evolutionary origin, the synthases for the aromatic
class and the "complex" or "reduced" class of polyketides
(most famously represented by the various
kinds of macrolides) appear to be too far diverged
from each other for significant cross-hybridization at
the DNA level to occur. Thus the act probes are in
general not useful for isolating genes for the modular
family of PKSs (but see under soraphen, section
III.B.7). For these, a different series of probes,
usually segments of DNA encoding various domains
of the erythromycin PKS, are effective (see section
III). Both classes of PKS genes appear to have
diverged too far from any FAS genes for cross-hybridization
at the DNA level to occur, so hybridization
of PKS probes with FAS genes is not usually a
problem.

B. Architecture of the Genes

The first clue to the architecture of a bacterial 
polyketide synthase came from the sequencing of a 
gene that complemented the so-called actIII class of
actinorhodin blocked mutants of S. coelicolor A3(2).
The actIII mutants represented one of two classes of
mutants that were deduced to be interrupted in very
early steps in the biosynthetic pathway because they
failed to secrete any metabolite that could be converted
by other mutants to actinorhodin, but they
would convert to actinorhodin compounds secreted
by four different classes of mutants, which were
therefore deduced to be blocked at later steps. 64
Whereas the actI mutants were nonpigmented, and
so were probably defective in assembly of the
polyketide chain, the actIII mutants produced a red
diffusible pigment and were therefore likely to be
defective in an early step in chain modification rather
than in chain assembly itself. Sequencing of the

fragments of wild-type DNA into any available
mutant blocked in a step of antibiotic biosynthesis,
looking for complementation of the mutation, and
then finding genes for the other steps of the pathway
on the complementing fragments; and (2) cloning a
library of DNA fragments from an antibiotic producer
into a sensitive surrogate host (usually a derivative
of Streptomyces lividans 66, which is a convenient,
generally antibiotic-sensitive and easily manipulated
strain), selecting resistant clones, and seeking biosynthetic
genes linked to the resistance gene on the
cloned DNA. By these procedures, the complete sets
of biosynthetic genes for two further aromatic
polyketides were isolated: the anthracycline
tetracenomycin (2) (tcm) from Streptomyces glaucescens
and oxytetracycline (3) (otc) from Streptomyces
rimosus.

The availability of cloned DNA carrying the biosynthetic
gene clusters for these three aromatic
polyketides led to a test of the idea 29 that the
sequences of different PKS genes, which are presumed
to have diverged from a common ancestor
(discussed in section VI), might be sufficiently conserved
for a DNA fragment for one synthase to be
used as a probe to isolate genes for others. The
presumptive positions of the DNA encoding the
polyketide KS and KR functions within the actinorhodin
gene cluster had already been deduced 32 and
so these DNA fragments (carrying the so-called actI
and actIII genes, respectively) could be used as
hybridization probes against restriction digests of the
tcm and otc cloned DNA. 33 Not only did strong cross-hybridization
occur with the KS probe, but it recognized
regions of the tcm and otc DNA clusters that
had already been identified by complementation
analysis as candidates for carrying the PKS genes.
The KR probe hybridized to a second segment of the 
otc gene cluster, but not to any part of the cloned tcm 
DNA. This latter result was significant because 
tetracenomycin is one of the few polyketides that 
arise without any of the keto groups of the nascent 
carbon chain being reduced. 

In the same study, the actI and actIII probes were
hybridized to Southern blots of restriction digests of
total genomic DNA of 25 actinomycetes, including 18
polyketide producers and seven nonproducers: hybridizing
bands were revealed in most, but not all,
of the producers, and were absent from most, but not
all, of the nonproducers. These results appeared to
establish a strong, although imperfect, correlation
between the presence in a particular actinomycete
of DNA sequences that cross-hybridized with one or
both of the act PKS probes and production of
polyketides by that strain, suggesting that the probes
could indeed be used to isolate further PKS genes.
This was successfully accomplished for a second
benzoisochromanequinone, granaticin (4), from Streptomyces
violaceoruber. 33 Proof that the hybridizing
sequences isolated from this strain indeed encoded
the granaticin PKS was obtained by a gene disruption
experiment. In this technique a fragment of a
gene is introduced into the wild-type strain and a
homologous crossover occurs between it and the
resident copy of the same DNA to generate a nonproducing
mutant (Figure 4). Similar experiments
were also done with the producer of the macrolide
milbemycin, S. hygroscopicus ssp. aureolacrimosus,
from which DNA hybridizing to the act probes had
also been isolated. In this case the results of the gene
disruption experiment were less clear-cut, because
the nonproducing phenotype was associated with
unexpected chromosomal deletions from the region
surrounding the presumptive PKS DNA of the wild-type
host; in retrospect this result should perhaps
have set off a warning bell.

Figure 4. A simple gene disruption scheme. A plasmid
carrying an internal fragment of a target gene to be
disrupted, together with an antibiotic resistance marker
gene, is introduced into an antibiotic-producing host strain
a recombinant in which the vector has become integrated
by crossing-over, as shown, will be able to grow. (Figure
kindly provided by M. J. Buttner.)
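
The logic of the gene disruption technique described above can be made concrete with a toy model. The Python sketch below is an illustration only, assuming the standard picture of a single homologous crossover between a circular plasmid carrying an internal gene fragment and the chromosomal copy of the gene. The function name and the four short "sequences" are hypothetical placeholders, not real DNA.

```python
def single_crossover_disruption(upstream, internal, downstream, vector):
    """Model integration of a circular plasmid that carries only `internal`.

    The target gene is upstream + internal + downstream. A single homologous
    crossover within `internal` inserts the whole plasmid, leaving two
    incomplete copies of the gene separated by the vector sequence.
    """
    left_copy = upstream + internal      # truncated: lacks the 3' end
    right_copy = internal + downstream   # truncated: lacks the 5' end
    integrated = left_copy + vector + right_copy
    return integrated, left_copy, right_copy


# Toy segments (hypothetical, not real gene sequence).
up, mid, down, vec = "AAAA", "CCCC", "GGGG", "TTTT"
intact_gene = up + mid + down
locus, left, right = single_crossover_disruption(up, mid, down, vec)
print(locus)                 # AAAACCCCTTTTCCCCGGGG
print(intact_gene in locus)  # False
```

Because the cloned fragment lacks both ends of the gene, neither copy in the integrated locus is complete, which is why the recombinant is expected to be a nonproducer.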

Following publication of the Malpartida et al. 33
paper, the act probes were much in demand for
attempts to isolate genes for further PKSs. Judging
from the published literature, these attempts were
all successful. However, it gradually became apparent
that not all of the genes isolated using these
probes actually encoded the PKS sought! For example,
Arrowsmith et al. isolated PKS genes from
the producer of the polyether monensin, Streptomyces
cinnamonensis, but their disruption failed to interfere
with monensin biosynthesis. By now, the first example
of a modular PKS, for biosynthesis of the
macrolide erythromycin by Saccharopolyspora erythraea,
had been discovered (see section III.A), and the
act PKS probes failed to hybridize to the DNA
encoding the erythromycin PKS; doubtless they did
not hybridize to the monensin PKS genes either, but
these have not been identified. Presumably genes
for a PKS for an unidentified aromatic polyketide had
been isolated from S. cinnamonensis. A second
complication was revealed by the isolation and sequencing
of DNA that complemented a mutation
(called whiE) that prevents biosynthesis of the gray-brown
spore pigment of S. coelicolor A3(2), resulting
in a white spore phenotype. 35 The chemical nature

actIII region revealed an open reading frame that
would encode a protein resembling several known
oxidoreductases, such as ribitol dehydrogenase from
Klebsiella aerogenes and alcohol dehydrogenase from
Drosophila melanogaster. 37 It was therefore deduced
that the actIII DNA encoded a discrete polyketide
ketoreductase. By implication the PKS would have
a type II structure with separate proteins for the
individual reactions of chain assembly and modification.

Sequencing of the actI region, believed to encode
the PKS KS, was delayed in relation to that of the
corresponding PKS-encoding segments of the gene
clusters for granaticin (gra) and tetracenomycin
(tcm), which had meanwhile been isolated (see above).
The sequences of the presumptive gra and tcm PKS-encoding
DNA 45,42 immediately confirmed the type II
nature of the PKSs by revealing open reading frames
that would encode proteins resembling the condensing
enzyme (the product of fabB) of the E. coli FAS 65
and the discrete acyl carrier proteins of the type II
FASs of bacteria and plants. When the actI region
was later sequenced, corresponding genes were identified. 38

As well as putative structural genes for a KS, an
ACP and (for act and gra but not tcm, as expected) a
KR, a "mystery gene" was revealed lying immediately
downstream of each KS gene and showing putative
translational coupling with it (a situation in which
there is overlap between the stop codon of the
upstream gene and the start codon of the downstream
gene: this arrangement is postulated to
facilitate cotranslation of two bacterial genes to yield
equimolar amounts of their protein products). 66 The
downstream genes (initially called open reading
frame 2: "ORF2") encoded proteins that resembled
strongly the KSs ("ORF1"), but lacked the putative
catalytic site for condensation (a characteristic motif
based on a cysteine residue). At this stage, the idea
that the "ORF2" sequences were nonfunctional pseudogenes
was a formal possibility, but the subsequent
finding that a mutation in actI-ORF2 abolished
actinorhodin production (see section II.C) 67 ruled this
out. I return to the role of "ORF2" later (section
II.F.1).
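
Translational coupling, as defined above, is an overlap between the stop codon of the upstream gene and the start codon of the downstream gene. The following Python sketch shows one way such an overlap could be checked once the two codon positions are known. It is an illustration under stated assumptions: the function name `codons_overlap` is hypothetical, the start codon set (ATG, GTG, TTG) is the usual bacterial set, and both test sequences are made-up strings rather than act or gra sequence.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_overlap(seq, stop_index, start_index):
    """Return True if the upstream stop and downstream start codons share bases.

    `stop_index` and `start_index` are 0-based positions of the first base of
    the upstream gene's stop codon and the downstream gene's start codon.
    """
    stop = seq[stop_index:stop_index + 3]
    start = seq[start_index:start_index + 3]
    if stop not in STOP_CODONS:
        raise ValueError(f"{stop!r} is not a stop codon")
    if start not in START_CODONS:
        raise ValueError(f"{start!r} is not a start codon")
    # Two 3-base windows share at least one base when their starts differ by < 3.
    return abs(stop_index - start_index) < 3


coupled = "GCCGGATGACCGAG"        # ATG at 5 overlaps TGA at 6 (ATGA)
uncoupled = "GCCTGAGGAGGCATGACC"  # TGA at 3, ATG at 12
print(codons_overlap(coupled, 6, 5))     # True
print(codons_overlap(uncoupled, 3, 12))  # False
```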

A further function encoded by the act cluster,
involved in cyclization of the nascent polyketide
carbon chain, had been proposed by consideration of
the phenotype of the so-called actVII mutants, 64
which were found to secrete an incorrectly cyclized
shunt product, mutactin. 68 The sequence of the
actVII gene, 38 and of its homologue in the gra
cluster, 45 gave no clue to the biochemical role of the
products of these genes. However, arguments described
in section II.E, based on the structure of
mutactin, suggested that these act and gra genes
would encode a bifunctional cyclase/dehydratase. In
contrast, a gene from the tcm cluster (tcmN), which
would encode a protein whose N-terminal half resembled
the corresponding regions of the act and gra
cyclases/dehydratases, but whose C-terminal half
resembled instead O-methyltransferases, would encode
a bifunctional cyclase/O-methyltransferase. 43,69
I return later to the role of the actVII gene also.

In describing the architecture of the act cluster, a
sixth gene also needs to be introduced. This is actIV,
whose function as a cyclase became apparent only
as a result of the genetic engineering studies that are
discussed below. The arrangement of the six act PKS
genes, and for comparison those of the genes that
encode other actinomycete aromatic PKSs, are shown
in Figure 5. There are striking resemblances in
overall architecture between the various clusters, but
also differences in the arrangements of some of the
homologous genes.

C. Would "Hybrid" Synthases Work? 

The potential for producing new, "hybrid" antibiotic
structures by engineering novel combinations of
antibiotic biosynthetic genes from different organisms
was first demonstrated when segments of the
act biosynthetic gene cluster, or the whole cluster,
were transferred into the streptomycetes that produce
medermycin or granaticin (and dihydrogranaticin). 70,71
The hybrid compounds discovered in these
studies, and in later experiments involving the carbomycin
and spiramycin biosynthetic genes, 72 were
polyketides, but the genes that were recombined to
make them encoded late, tailoring steps in the
biosynthetic pathways (catalyzed by individual enzymes
such as reductases, hydroxylases, or other
group transferases) rather than steps in polyketide
chain assembly and immediate postassembly modification.
While there was hope that more radical
engineering of product structure could be achieved
by bringing together subunits of type II PKSs from
different organisms to form functional hybrid synthases,
this was not a foregone conclusion; specific
protein-protein interactions might have evolved over
long periods of evolutionary history to ensure that a
set of subunits could work efficiently together to form
a functional synthase. The first indication that this
would not necessarily be a barrier to effective "mixing-and-matching"
of PKS subunits was provided by
the complementation of a mutation in the actIII (KR)
gene of S. coelicolor by the homologous DNA from
the granaticin gene cluster and what was at that time
presumed to be the milbemycin gene cluster to
produce pigments similar or identical to actinorhodin. 33
A reciprocal experiment was later reported by
Bartel et al., 73 who transformed a KR-negative mutant
of Streptomyces galilaeus that produced 2-hydroxyaklavinone
with the actIII gene and restored
aklavinone production. Importantly, they went further 73
in demonstrating the production of a novel (for
S. galilaeus) polyketide, aloesaponarin II (see below)
by wild-type S. galilaeus transformed with a plasmid
carrying actI-ORF1 and actI-ORF2; 38 this was an
example of "mixing-and-matching" in which there
must have been effective cooperation of the act KS
with heterologous subunits.

In order to investigate further the possibilities for
heterologous PKS subunit interactions, attempts
were made to complement mutations in each of the
five so far identified act PKS subunit genes (actIII,
actI-ORF1, actI-ORF2, actI-ORF3, and actVII) by the
corresponding genes from the gra set of S. violaceoruber.
Since the assembly, ketoreduction, and cyclization
of the polyketide chains that are later
tailored to either actinorhodin or granaticin were
expected to be identical, these experiments would
test for the productive interaction of heterologous
PKS subunits without requiring the hybrid PKS to
generate a novel product, which could perhaps have
failed for purely chemical reasons. The results were
very encouraging. Complementation of the actIII
(KR) mutant by the gra KR gene (ORF5: Figure 5)
was confirmed, and complementation of actVII (cyclase/dehydratase)
mutants by the corresponding gra
gene (ORF4) was also clearly demonstrated. The
situation was more complex for actI, because the 13
available mutants had not previously been assigned
to one or other of the three ORFs. Complementation
tests using gra-ORF1, gra-ORF2, and gra-ORF3
allowed a number of conclusions to be drawn: (1)
several actI-ORF1 mutants were identified by heterologous
complementation by gra-ORF1; (2) a single
actI-ORF2 mutant was clearly identified by complementation
by gra-ORF2, thereby demonstrating not
only heterologous complementation but also the
essential nature of the ORF2 gene; and (3) none of
the actI mutants was complemented by gra-ORF3,
indicating either that the set of actI mutants included
no examples of lesions in this small ACP gene, or a
failure of heterologous complementation.
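
The reasoning used to assign the actI mutants to individual ORFs can be sketched as a simple lookup. The Python below is an illustration only: the function name, the mutant names, and the outcomes are hypothetical and are not data from the studies described. A mutant is assigned to an ORF when exactly one donor gene restores production; otherwise it is left unassigned, since the test alone cannot distinguish the absence of a lesion in that gene from a failure of heterologous complementation.

```python
def assign_mutants(results):
    """Map each mutant to the single donor gene that complements it.

    `results` maps mutant name -> {donor gene -> True/False}. Mutants restored
    by no donor gene, or by more than one, stay unassigned (None).
    """
    assignment = {}
    for mutant, outcome in results.items():
        hits = [gene for gene, restored in outcome.items() if restored]
        assignment[mutant] = hits[0] if len(hits) == 1 else None
    return assignment


# Hypothetical outcomes for three made-up mutants (not data from the studies).
results = {
    "mutant_a": {"gra-ORF1": True, "gra-ORF2": False, "gra-ORF3": False},
    "mutant_b": {"gra-ORF1": False, "gra-ORF2": True, "gra-ORF3": False},
    "mutant_c": {"gra-ORF1": False, "gra-ORF2": False, "gra-ORF3": False},
}
print(assign_mutants(results))
# {'mutant_a': 'gra-ORF1', 'mutant_b': 'gra-ORF2', 'mutant_c': None}
```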

These early results were soon extended by (1)
unambiguous demonstration of the requirement of
the actI-ORF3 gene product, the ACP, in a functional
PKS; 76 (2) functional replacement of the act ACP by
the corresponding ACPs from the gra, tcm, otc, and
putative fren PKSs 77 and even, at a low level, by a
putative FAS ACP from Saccharopolyspora erythraea; 76
and (3) complementation of an actI-ORF1 (KS)
mutant, not only by the corresponding gra gene as
reported earlier, but by homologues from the otc and
whiE (spore pigment) gene clusters. 78 The stage was
now set for a systematic mix-and-match approach in
which the structures of polyketides generated by
recombinants containing hybrid PKS gene clusters
could begin to reveal the programming rules for the
type II PKSs, 79,80 but first a new genetic test system
was needed.

D. A Special Host-Vector System for 
Construction and Expression of Recombinant 
PKSs 

The studies just reviewed, which established the 
feasibility of a systematic search for novel polyketides 
generated by hybrid PKSs, were carried out in strains 
of S. coelicolor with point mutations or deletions in 
individual PKS genes of the act cluster, which 
consists of 22 structural, regulatory, export, and 
resistance genes. 37,38,81-83 It soon became apparent
that these strains were far from ideal for chemical
analysis, since recombinants could potentially generate
complex mixtures of end products, shunt products,
and degradation products through the action of
the act genes that were still functional in the
mutants. 76 It was therefore decided to build sets of
aromatic PKS subunits and to express them in the
absence of the enzymes for tailoring pathway steps,
whose natural role is to diversify the primary products
of the PKSs themselves. A special host-vector
system was engineered for this objective. The host
is an S. coelicolor A3(2) derivative (strain CH999)
from which the entire set of 22 act genes (except for
one, actVI-ORFA, at the extreme lefthand end of the
cluster) 82 has been deleted and replaced by a convenient
marker gene for erythromycin resistance. 84
(This was achieved by a variation in the gene disrup- 
tion procedure in which a segment of DNA from each 
end of the cluster is cloned on a suicide vector on 
either side of the marker gene to form a replacement 
"cassette". Double crossing-over between this cas- 
sette and the corresponding chromosomal sequences 
gives rise to the desired replacement.) Into this host 
is introduced, by a standard protoplast transforma- 
tion procedure, 85 any of a series of plasmids carrying 
the desired sets of PKS subunit genes. These plas- 
mids are based on the replicon of a stably inherited, 
low copy number S. coelicolor plasmid, SCP2*, 86 onto 
which the tsr marker for thiostrepton resistance in 
Streptomyces is cloned. (Low copy number is often
preferable to multicopy cloning of antibiotic biosynthetic
genes in Streptomyces, probably to avoid clone
instability caused by physiological stress.) The vector
also carries an E. coli origin of replication from
ColE1, and the ampicillin resistance gene (bla) for
selection in E. coli; thus rapid genetic engineering
can be carried out in E. coli, before transfer of the
finished constructs to S. coelicolor CH999. A further
component of the plasmid is the actII-ORF4 gene, the
natural pathway-specific activator of the act biosynthetic
genes; 81 its product serves to activate transcription,
in an appropriate, developmentally controlled
manner, from the actI and actIII promoters,
which form a divergent pair. 37,38 Downstream of one
or both of these promoters are cloned the desired PKS
subunit genes. In pRM5 (Figure 6), the founding
member of the family of plasmids, 84 this set consists
of the actIII, actI (3 ORFs), actVII, and actIV genes;
they are cloned in their natural arrangement (Figure
5), except for the translational decoupling of some
pairs of genes and the introduction, between each
pair, of convenient unique restriction sites to aid the
construction of recombinants by the exchange of
genes. CH999 carrying pRM5 or a series of plasmids
based on it has generated a great deal of new
information about the roles of the aromatic PKS
subunits, which will now be summarized. Although
not following precisely the historical sequence, it will
be convenient first to describe experiments with
recombinants carrying subsets of PKS genes from the
same strain, and then experiments with hybrid
clusters containing genes originating from two or
more Streptomyces species.

Figure 6. The vector pRM5. 84 The plasmid is bifunctional
for replication in both E. coli (ColE1 origin of replication
(ori) and β-lactamase (bla) gene for selection) and Streptomyces
(SCP2* origin of replication and thiostrepton
resistance gene (tsr) for selection). The cloned PKS genes
to be expressed in Streptomyces (in this example the act
KS, CLF, ACP, ARO, CYC, KR genes) are cloned downstream
of their natural promoters, PactI and PactIII, which
are transcriptionally activated by the product of the actII-ORF4
gene. (Figure kindly drawn by T. Kieser.)

E. Recombinants Carrying Nonhybrid Subsets of
PKS Genes

CH999 carrying pRM5 was found to produce a
compound, aloesaponarin II, which had earlier been
identified by Bartel et al. 73 in the original actVI
mutant class. Cosynthetic studies had suggested
that the actVI mutants were blocked immediately
after the actIV mutants in the biosynthetic sequence
leading to actinorhodin. 6* Aloesaponarin II was
postulated to arise as a shunt compound from the
hypothetical pathway intermediate (13: Figure 7)
produced by the enzymes encoded by the
actI + III + VII + IV genes. 7* This intermediate would
be an octaketide (i.e., C16) that had undergone the
correct C-9 reduction and dehydration and correct
formation of the two carbocyclic rings characteristic
of the actinorhodin half-molecule. In the actVI
mutants, 13 was postulated to undergo spontaneous
formation of a third carbocyclic ring to generate an
anthraquinone system, followed by decarboxylation to
give aloesaponarin II (14). The isolation of
aloesaponarin II in significant quantities from cultures
of CH999/pRM5 was not only an encouraging start for
the project but was entirely consistent with these
earlier findings because pRM5 in fact carries
precisely the set of genes (actI + III + VII + IV) that
were postulated to be responsible for formation of
aloesaponarin II in the actVI-blocked mutants.
Moreover, CH999/pRM5 also made
3,8-dihydroxy-1-methylanthraquinone-2-carboxylic acid
(DMAC, 15), the expected undecarboxylated precursor
of aloesaponarin II. These results established pRM5
as the ideal starting point for the recombinant
construction project because it evidently contained all
the genes needed to encode a correctly programmed
PKS without the complication of the tailoring enzymes.

Among the set of genes present on pRM5, it turned
out that all three of the genes in the actI segment
were needed to form any recognized product in the
CH999 host; they were therefore called the "minimal
PKS" genes. 88 (It was a happy chance that the
"classical" actI mutants, although potentially
representing lesions in three different genes (ORF1,
ORF2, and ORF3), all correspond to mutations in the
minimal PKS, thereby providing a retrospective
rationalization for designating them actI-ORF1,
actI-ORF2, and actI-ORF3; and luckily the other
relevant mutant classes (actIII, actVII, and actIV)
each represent mutations in a single gene.) The
product of the act minimal PKS was SEK4 (16) 89 with
an unreduced (as expected) C16 chain and just the
first carbocyclic ring correctly formed by aldol
condensation between C-7 and C-12; the rest of the
structure, as in all the shunt products to be
described, would have arisen by the most probable
uncatalyzed chemistry, in this case to give a
hemiketal at the methyl end of the chain and a pyrone
at the carboxyl end.

When the actIII gene was added to the minimal
PKS gene set, its predicted product, the KR, reduced
the C-9 carbonyl, giving the shunt product mutactin
(17). 87 This also agreed with precedent, because the
construct carrying actI + III would be equivalent to
an actVII mutant, in which mutactin had originally
been identified. 68 Mutactin is bicyclic, but the second
ring is not the one found in actinorhodin, leading
Zhang et al. 68 to postulate that the actVII gene
encodes a second ring cyclase; C-9 is also not
dehydrated, so the idea of a bifunctional
cyclase/dehydratase was born. 69

When McDaniel et al. 87 added actVII to actI + III
they identified a further novel shunt product, SEK34
(18), which still lacked the correct second ring, but
which, unlike mutactin, now had an aromatic first
ring; the actVII protein was therefore renamed an
"aromatase", which would do its job by carrying out
two dehydrations on the first ring. This proposal was
speculatively linked to the finding that the actVII
protein may have two separate functional domains
representing the N- and C-terminal halves. Their
primary sequences are similar, suggesting that they
might have originated by an ancestral gene
duplication, a feature that was noticed first for the
corresponding gene of the frenolicin (fren) PKS cluster,
but is retrospectively apparent also in its homologues
in the other gene clusters. 50 I return to this point
later.

Finally, when the actIV gene was added to the
growing set of act genes (I + III + VII), as in pRM5
itself, the production of DMAC and aloesaponarin II
implied that actIV encoded the true second ring
cyclase. 87 The actIV protein had earlier been found
to have sequence similarities with some
Zn2+-containing β-lactamases, 38 and it was pointed out
that the second ring cyclization involves an aldol
condensation and that a class of aldolases are
Zn2+-containing enzymes. 87

That the S. coelicolor CH999 host was not
providing a unique background for such experiments had
been indicated by the results of Bartel et al. 73 and
was reinforced by those of Kim et al.; 78 both groups
identified mutactin and aloesaponarin II as the
products of suitable combinations of the act PKS
subunits in a heterologous, polyketide nonproducing
host, Streptomyces parvulus. This system was used 78
to confirm that the presumed active site Cys in the
actI-ORF1 (KS) protein was essential for polyketide
synthesis, as expected from sequence comparisons
with other condensing enzymes (a result also found
for the corresponding residue in the tcm KS); 90 and
that the presence of the actI-ORF2 gene was also
required for any product formation.

In other experiments, subsets of the tcm PKS gene
cluster were expressed in the CH999 host, leading
to conclusions about their roles comparable to
those for the act genes. For example, the tcm
homologues of the three actI gene products yielded an
unreduced C20 compound, SEK15 (19: Figure 8), in
contrast to the C16 SEK4 (16) made by the
corresponding act gene products. 89 Clearly chain length
was being determined by the minimal PKS, but it
needed some of the first mix-and-match experiments
to try to attribute this property to a specific subunit
of the PKS. Similarly, the finding of the natural first
ring in SEK4, with a C-7/C-12 condensation, had
suggested that the minimal PKS could control first
ring formation, but SEK15 also arose by C-7/C-12
first ring condensation, even though the natural
course of tetracenomycin biosynthesis in S.
glaucescens involves a C-9/C-14 condensation to produce
the first ring. Again, mix-and-match experiments
were needed to throw light on control of the
regiospecificity of cyclization. These will now be
described.
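
The chain-length names used in this section (octaketide = C16, decaketide = C20, and so on) follow from counting two backbone carbons per condensed unit. A minimal sketch of that arithmetic in Python is given below; the function names are illustrative only, and the sketch assumes an acetate starter and two-carbon extender units, so it does not cover chains begun with a different starter such as propionate.

```python
# Numerical prefixes for the chain sizes mentioned in the text.
PREFIXES = {8: "octa", 9: "nona", 10: "deca", 11: "undeca", 12: "dodeca"}


def carbons_in_backbone(n_units: int) -> int:
    """Backbone carbons for a chain built from n two-carbon units.

    Assumption: acetate starter plus two-carbon extender units only.
    """
    if n_units < 1:
        raise ValueError("a polyketide chain needs at least one unit")
    return 2 * n_units


def describe(n_units: int) -> str:
    """Return a label such as 'octaketide (C16)'."""
    prefix = PREFIXES.get(n_units, f"{n_units}-")
    return f"{prefix}ketide (C{carbons_in_backbone(n_units)})"


if __name__ == "__main__":
    for n in (8, 9, 10, 11, 12):
        print(describe(n))
```

On this counting, SEK4 (C16) is an octaketide and SEK15 (C20) a decaketide, as stated above.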

F. Recombinants Carrying Hybrid Sets of PKS 
Genes 

The first hybrids to be constructed using the
CH999/pRM5 host-vector system involved all
possible combinations of the actI-ORF1, -ORF2, and
-ORF3 genes and their homologues from the gra, tcm,
and fren PKS gene sets. 84,91 The actIII (KR) gene was
present in all these early constructs, as were the
actVII and actIV genes (later identified as aromatase
and cyclase genes, respectively, as just described).
The results were dramatic, in that the majority of
the recombinants generated polyketide products;
these included the known octaketides DMAC and
aloesaponarin II, and three novel compounds: a
decaketide (RM20: 20), a nonaketide (RM18: 21), and
an octaketide (RM18b: 22). Over the last three years
more than 30 novel compounds have been generated
by the mix-and-match approach. These have not only
established combinatorial biosynthesis as a route to
interesting new compounds ("unnatural natural
products") 99,100 but have thrown considerable light on
the roles of the various PKS subunits in programming.



1. Chain-Length Determination: Identification of the
Chain-Length Factor

The most striking conclusions to be drawn from the
first set of results concerned the determination of
carbon chain length. One deduction was very clear:
the chain length was C16 for act and gra, C20 for tcm,
and a mixture of C18 and C16 for fren. This last
finding of both nona- and octaketides (RM18 (21) and
RM18b (22)) for the fren PKS was significant because
S. roseofulvus, the natural host for the genes,
produces a mixture of the nonaketide frenolicin and the
octaketide nanaomycin. (The characterization of
RM18 and RM18b in the recombinants actually
provided evidence that the fren genes did in fact
encode a PKS responsible for biosynthesis of both
frenolicin and nanaomycin, a problem that could not
be resolved by gene disruption because of the
intractability of S. roseofulvus to genetic
transformation.) 50
Construction of heterologous ORF1 and ORF2
combinations met with mixed success. Both
combinations of act and gra were functional, but yielded no
information on chain-length determination because
both the natural PKSs produce octaketides. The
combinations of fren or tcm ORF1 with act ORF2 also
yielded products. In contrast, the reciprocal
combinations, act ORF1 with fren or tcm ORF2, were
inactive, as were both combinations of ORF1 and
ORF2 from the tcm and fren clusters. Nevertheless
it was possible to conclude that the ORF2-encoded
PKS components determined chain length, "at least
in part". 91 This was because the combination of act
ORF1+ORF2 produced an octaketide; this was not
changed by substituting act ORF1 by either fren or
tcm ORF1; but when both act ORF1 and act ORF2
were replaced by the corresponding fren or tcm genes,
a mixture of a nonaketide and an octaketide, or a
decaketide, respectively, was produced. The product
of the ORF2 gene was therefore named the
"chain-length factor" (CLF) to provide a convenient epithet
for this hitherto "mystery" gene.
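
The reasoning that assigns chain-length control to the ORF2 product can be restated as a small consistency check. The Python sketch below records only the ORF1/ORF2 combinations whose outcome is stated explicitly in the text (the act/gra hybrids are left out because their products are not named), and asks whether the chain lengths observed track the origin of ORF1 or of ORF2. The data layout and function names are illustrative assumptions, not part of the original work.

```python
from collections import defaultdict

# (ORF1/KS source, ORF2/CLF source) -> chain lengths (backbone carbons)
# reported in the text. An empty set marks a combination reported inactive.
OBSERVED = {
    ("act", "act"): {16},
    ("gra", "gra"): {16},
    ("fren", "act"): {16},
    ("tcm", "act"): {16},
    ("fren", "fren"): {16, 18},
    ("tcm", "tcm"): {20},
    ("act", "fren"): set(),
    ("act", "tcm"): set(),
    ("tcm", "fren"): set(),
    ("fren", "tcm"): set(),
}


def lengths_by(index: int) -> dict:
    """Group outcomes of functional combinations by one subunit's origin.

    index 0 groups by ORF1 (KS) source, index 1 by ORF2 (CLF) source.
    """
    groups = defaultdict(set)
    for combo, lengths in OBSERVED.items():
        if lengths:  # skip inactive combinations
            groups[combo[index]].add(frozenset(lengths))
    return dict(groups)


def is_consistent(groups: dict) -> bool:
    """True if each origin is associated with exactly one outcome."""
    return all(len(outcomes) == 1 for outcomes in groups.values())


if __name__ == "__main__":
    print("tracks ORF1 origin:", is_consistent(lengths_by(0)))
    print("tracks ORF2 origin:", is_consistent(lengths_by(1)))
```

Grouping by ORF2 origin gives one outcome per source, whereas grouping by ORF1 origin does not (fren and tcm ORF1 each give different chain lengths depending on their partner). This is consistent with, but as the text notes does not prove, chain-length control by the ORF2 product "at least in part".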

It was originally suggested 42 that the ORF1- and
ORF2-encoded PKS subunits might associate to form
a heterodimeric KS, reminiscent of the known
homodimeric E. coli enzymes, 101 and this remains a
reasonable hypothesis. The finding that these two
proteins are more "choosy" in their ability to form
an active recombinant PKS than other components
such as the ACP is consistent with this idea, since
specific protein-protein interactions could well be
required to assemble a functional dimer from these
similar but not identical subunits. It may be relevant
that, while the arrangement of the genes in the
different PKS clusters varies (Figure 5), there is so
far no exception out of at least 18 examples to the
finding of the CLF gene just downstream of the KS
gene, and coupled to it in all except the fren and dau
clusters. Phylogenetic analysis clearly implies that
the ORF1 and ORF2 genes originated by a gene
duplication in an ancestral PKS gene set and have
then diverged (Figure 9). It also appears that the
ORF2 proteins are more diverged from each other,
and from the presumed common ancestor shared with
the ORF1 proteins, than are the ORF1 proteins from
each other. 51 This is consistent with, although not
of course proving, the idea that the ORF2 proteins
play a more specific role in chain-length
determination than the ORF1 proteins, which would serve
primarily to catalyze the condensation reaction itself.

Figure 9. Phylogenetic tree of amino acid sequences of
KS and CLF subunits of actinomycete type II PKSs. For
sources of data, see Table 1. The tree was constructed using
the PHYLIP 3.5 package with the "Dayhoff PAM distance
matrix" and the Fitch-Margoliash method for tree
construction, and the S. glaucescens putative FAS condensing
enzyme (Fab) 101a as outlier. (The number of amino acid
substitutions is proportional to the lengths of the horizontal
lines; the lengths of the vertical lines are arbitrary.) (Figure
kindly provided by W. P. Revill.)

2. Role of the Minimal PKS in First Ring Cyclization

Establishing the correct fold of the nascent carbon
chain to close the first carbocyclic ring of the aromatic
polyketides has long been recognized as a key step
in controlling product structure. 102,103 The finding of
SEK4, with its C-7/C-12 ring closure, as the product
of the act minimal PKS suggested that these three
proteins are capable of establishing the correct first
ring cyclization. This conclusion was later modulated
following isolation of other novel metabolites made
by minimal PKSs.

The product of the tcm minimal PKS plus the act
KR was initially recognized as RM20 (20), 84 and later
two further products, RM20b (23) and RM20c (24),
were identified. 92 All three showed C-7/C-12 first
ring closure. When the tcm minimal PKS was
expressed without the act KR (but with the actVII
and actIV genes) SEK15 (19), also with C-7/C-12
closure, had been found 89 (see section II.E), but small
quantities of SEK15b (25), with C-9/C-14 closure, also
occurred; and when the tcm minimal PKS was
expressed alone, the ratio of SEK15 and SEK15b
became approximately 1:1. 88 In contrast, in its
natural host, S. glaucescens, the complete tcm
PKS, in the absence of the unnatural act KR,
ARO, and CYC, produces only the C-9/C-14 closure
characteristic of tetracenomycin (2: Figure 3).

These results led to the idea that a minimal
PKS can control first ring cyclization, but that this
control can be modulated by other PKS subunits
which need not necessarily act as enzymes on the
nascent carbon chain (for example in the case of the
actVII and actIV proteins just mentioned), but which
may influence the interactions between the subunits
of the PKS complex and its interactions with the
growing chain. Another example of the same
phenomenon was postulated following the discovery of
a further metabolite, SEK4b (26), produced in equal
quantities along with the previously identified SEK4
(16) by the act minimal PKS. 93 In this molecule the
C-7/C-12 cyclization is replaced by an unusual
C-10/C-15 first cyclization of the methyl end of the
carbon chain. Production of SEK4b was ascribed to
uncatalyzed cyclization of this end after its
premature release from the PKS. Again, the outcome was
influenced by other PKS subunits: the act minimal
PKS alone gave a ratio of 1:5 for SEK4 and
SEK4b, but in the presence of the actVII and actIV
proteins, the proportion of the naturally folded SEK4
was much greater. 93

3. Aromatases and Cyclases

As described above (section II.E), studies of subsets
of the act PKS genes identified the actVII gene
product as an aromatase for the first ring. These
studies also indicated that the actIV gene product is
a second ring cyclase, catalyzing an aldol
condensation between C-5 and C-14 on the pathway to
benzoisochromanequinones such as actinorhodin.
Since then, PKS subunits from other streptomycetes
have been characterized as aromatases and/or
cyclases.

Several PKS gene sets, besides the act set, contain
homologues of the actVII gene. These PKSs
synthesize the C-9-reduced octa-, nona-, and decaketide
carbon chains of the benzoisochromanequinones
actinorhodin, granaticin, frenolicin, and griseusin, as
well as the decaketides of oxytetracycline and other
compounds, and the unknown aromatic polyketide of
S. cinnamonensis. They tend to show a resemblance
between the N- and C-terminal halves of the proteins,
first recognized in the fren example and postulated to
have arisen by an ancestral gene duplication (Figure
10). They can be described as "didomain" aromatases,
presumably reflecting the abstraction of two
molecules of water. Strong support for the existence of
two functional domains was recently obtained by
artificially expressing them as separate proteins, which
functioned together in vivo. In contrast, the tcmN
gene product has an N-terminal portion homologous
with that of the actVII protein, fused to a C-terminal
O-methyltransferase, and this N-terminal portion is
capable of functioning by itself as a "monodomain"
aromatase. A homologous monodomain protein is
encoded by the whiE-ORFVI gene and its homologues
in the sch and cur clusters (Figures 5 and 10), which
are components of PKSs that probably, like the tcm
PKS, synthesize unreduced polyketide chains, whose
first rings would aromatize spontaneously.

Figure 10. Dendrogram of the deduced amino acid
sequences of the gene products of some presumed
didomain aromatases of actinomycete type II PKSs; the
N-terminal half of the tetracenomycin
aromatase/O-methyltransferase; and the whole sequences of
presumed monodomain aromatase subunits of the WhiE,
Sch, and Cur PKSs.

Both tcmN and whiE-ORFVI have been combined
with minimal PKS gene sets (including act and fren),
with or without the act KR, and various novel
polyketide products were identified. From their
structures it was deduced that the tcmN protein can
influence the regiospecificity of first ring closure of
unreduced, but not reduced, carbon chains: a
C-9/C-14 first ring was formed when tcmN was added to
the act minimal PKS gene set, instead of the C-7/C-12
ring of SEK4 formed in its absence, providing another
example of the modulation described above. However,
the primary function of the TcmN protein was deduced
from the structures of RM80 and RM80b (29) (both
products of the tcm minimal PKS together with tcmN)
to be that of a second ring aromatase. The homologous
whiE-ORFVI protein also influenced the
regiospecificity of first ring cyclization of unreduced
but not reduced carbon chains and, interestingly, it
could catalyze first ring aromatization of a reduced
chain (which it presumably never sees naturally) to
yield a novel compound.

4. Flexibility of Chain-Length Determination

A striking feature of most PKSs is the fidelity with
which they control the carbon chain length. However,
exceptions have emerged from the construction of
recombinants in the S. coelicolor host-vector system.
The first concerns the fren minimal PKS which, as
described above, was found to construct both octa-
and nonaketides when heterologously expressed along
with the act KR, as it is presumed to do along with
the fren KR in its natural S. roseofulvus host. 91 Later
it was discovered that the choice of chain length
could be constrained by other PKS subunits, in either
direction (Figure 11): the fren minimal PKS alone
produced only the octaketides SEK4 and SEK4b,
whereas addition of tcmN caused the (almost)
exclusive production of a novel nonaketide, PK8 (31). 96

Figure 11. Carbon chain assembly and first ring cyclization patterns that lead to six novel polyketides catalyzed by the
fren minimal PKS, alone or modulated by the presence of the act KR or tcmN aromatase. 96 Note that the fren minimal
PKS alone builds only octaketides (with C-7/C-12 or aberrant C-10/C-15 ring closure), SEK4 and SEK4b; in the presence
of the act KR it builds both octaketides (with either C-7/C-12 or C-5/C-10 ring closure), mutactin and RM18b, and a
nonaketide (with C-7/C-12 ring closure), RM18; and in the presence of the tcmN aromatase it builds only a nonaketide
(with C-9/C-14 ring closure), PK8. See text for details.



The second example involves the whiE minimal
PKS which, when cloned in the CH999 expression
system, was found to be capable of synthesizing both
C22 undeca- and C24 dodecaketides, TW94b (32) and
TW94 (33). 97 In this case the chain length could be
constrained to be exclusively C24 by addition of the
whiE-ORFVI gene discussed above in connection with
its role in influencing first ring cyclization and as a
first ring aromatase. 105 This might be taken to imply
that the natural spore pigment of S. coelicolor, the
product of the complete whiE gene set, is a
dodecaketide.
5. Choice of Starter Unit

The starter unit for the majority of aromatic
polyketides is thought to be acetate, but there are
exceptions: the best documented is the use of
propionate for anthracyclines like daunorubicin and
doxorubicin made by strains such as S. sp. C5 and
S. peucetius. In other examples the situation is less
clear. The starter unit for oxytetracycline
biosynthesis by the otc PKS is commonly
believed to be malonamate, but might be malonate.
In the case of frenolicin (6: Figure 3), in S.
roseofulvus, with its fully reduced methylene group at
C-17, there has been speculation that the starter unit
might be butyrate, obviating a need for a full cycle of
reduction at this carbon by the PKS. Genetic studies
have so far failed to solve the "starter unit problem"
but have thrown up some observations relevant to it.
For example, cloning of the otc minimal PKS (in the
presence of the act KR) in S. coelicolor CH999 yielded
products derived from acetate, just as the tcm
minimal PKS had done. This showed that the otc PKS
can use an acetate starter unit, but left open the
natural determination of the correct starter.
Similarly, nonaketides derived from nine acetate
residues were produced by the minimal fren PKS in
S. coelicolor, as described above. The dps gene
cluster (Figure 5) includes a gene (dpsC) whose
product resembles the E. coli ketosynthase III,
specific for the first condensation in fatty acid
biosynthesis, and a second gene (dpsD) whose product
shows sequence similarities with acyl transferases,
suggesting that they might be involved in the choice
of the propionate starter unit, but neither gene was
required for the dps minimal PKS to produce
polyketide chains starting with propionate in the
S. coelicolor expression system. Elucidation of
starter unit choice for the initiation of aromatic
polyketide carbon chain assembly, and of whether
initiation might share steps with that of fatty acid
biosynthesis in the same organism, will probably
require a more biochemical approach, aided by
genetics, for its solution.

6. Predictions about Biosynthesis and Structure

A further benefit of cloning recombinant sets
of PKS subunits in S. coelicolor has been its use to
provide answers to questions in situations in
which chemical approaches had proven difficult.

One example concerns the spore pigments of
Streptomyces species, which have so far resisted
attempts to isolate them for chemical characterization.
Cloning of the whiE, sch, and cur minimal
PKS gene sets from S. coelicolor, S. halstedii, and
S. curacoi, respectively, alone or in combination with
other genes, in S. coelicolor strain CH999 established
their products as C24 and (at least in this artificial
situation) C22 polyketides, even though the complete
structures of the spore pigments are unknown.
Similarly, the finding of decaketide products for the
cloned mon minimal PKS suggests that the unknown
aromatic metabolite of S. cinnamonensis which it
presumably makes is itself a decaketide.

Another example concerns the aureolic acid group
of polyketides, such as mithramycin (12),
produced by Streptomyces argillaceus, for which
alternative routes for carbon chain assembly had been
considered: joining of shorter polyketide chains, or
direct assembly of a decaketide. Cloning of the mtm
minimal PKS in S. coelicolor in the presence of the act
KR produced the already identified decaketides
RM20b (23) and RM20c (24), clearly establishing that
the mtm PKS can synthesize a decaketide and
suggesting this as the most plausible origin of the
mithramycin carbon backbone.

7. Design Rules for Novel Aromatic Polyketides

As described above, analysis of the polyketides
produced by many recombinants carrying different
combinations of PKS subunits led to a series of
conclusions about PKS programming. This
retrospective analysis could be converted into a set of
"design rules" with enough predictive power to
generate novel molecules to order, including SEK43
as well as PK8 already mentioned.
These rules are summarized in Table 2.



G. In Vitro Studies

Although much has clearly been learned about the
responsibility of different PKS subunits for aspects
of programming, the mechanisms involved remain
unclear. Now that genetic studies have set the scene,
biochemical approaches, often making use of
genetically engineered strains, will make an
increasingly important contribution. Significant steps
toward the development of fully controlled in vitro
systems have been taken by the expression of
components of minimal PKSs in E. coli: the tcm KS,
CLF, and ACP, and the ACPs of several other PKSs. 114
A solution structure of the recombinant act ACP by
NMR represents the first such structure for a PKS
component. A ground-breaking step toward the
goal of reconstituting in vitro polyketide synthesis
was taken by Shen and Hutchinson, who obtained
synthesis of tetracenomycin F2, a tetracenomycin C
precursor, from added acetyl and malonyl CoA in
extracts of a strain in which the tcm KS, CLF, ACP,
and aromatase/cyclase (TcmN) PKS subunits were
overexpressed. When the ACP in the extract was
biochemically depleted, synthesis was much reduced,
as expected, but could be restored by adding back
recombinant tcm ACP made in E. coli. More recently,
recombinant TcmN protein was added to an
extract prepared from a strain lacking it, causing
a shift in the products away from SEK15. 118 Using
cell-free preparations from S. coelicolor recombinants,
Carreras et al. obtained in vitro synthesis of
polyketides from acyl CoA precursors when the
minimal PKS was present, and DMAC when the set of
genes included not only those for the minimal PKS
but also those for the KR, ARO, and CYC. All these
results are in striking agreement with the in vivo
findings described above and augur very well for the
further development of fully in vitro PKS systems and
their use to elucidate PKS mechanisms.

Table 2. Design Rules for the Biosynthesis of Novel Aromatic Polyketides

structural feature
    design rules and comments

carbon chain length
    controlled by the minimal PKS (KS+CLF+ACP)
    ACPs are interchangeable (at least in the range C16-C24)
    the CLF is crucial to correct chain length, but heterologous
      KS+CLF combinations are often nonfunctional; therefore,
      choose homologous KS+CLF pairs for engineering novel
      molecules
    in the (unusual) cases of relaxed chain length control (fren,
      whiE), other PKS subunits can influence the choice (by one
      chain-extender unit): for example the fren minimal PKS alone
      made only C16 chains, with tcmN it made only C18 chains, and
      with the act KR it made a mixture
    CLFs presumably have a common ancestry with KSs and have
      diverged more from each other and from the common progenitor
    homologous KS and CLF may form a heterodimer

ketoreduction
    a specific ketoreductase is required
    the act KR (the only KR to have been studied in detail) works
      on chains of at least C16-C24
    the act KR normally reduces at C-9 (occasionally at C-7);
      probably most other KRs have similar regiospecificity because
      their cognate natural products show C-9 ketoreduction
    of the sequenced PKS gene clusters, tcm, whiE, sch, and cur
      lack a KR

first ring cyclization
    can be controlled by the minimal PKS, but regiospecificity can
      be influenced by other PKS subunits; for example the tcm
      minimal PKS alone made a mixture of C-7/C-12 and C-9/C-14
      cyclized compounds, but in presence of act KR, ARO and CYC,
      C-7/C-12 cyclized compounds arose, and in S. glaucescens,
      only C-9/C-14 cyclized compounds; tcmN caused the act
      minimal PKS to form C-9/C-14 cyclized products instead of
      C-7/C-12
    in the presence of the act KR (and presumably others),
      regiospecificity of first ring cyclization depends on the
      position of ketoreduction; C-7/C-12 for C-9 reduction,
      C-5/C-10 for C-7 reduction

first ring aromatization
    for unreduced molecules, this is uncatalyzed
    for reduced molecules, needs an aromatase; AROs like the
      actVII homologues are internally duplicated ("didomain"
      proteins), perhaps reflecting the need to extract two
      molecules of water; they are specific for chain length: 98
      gris ARO works on C20, C18, C16; fren ARO works on C18, C16;
      act ARO works on C16

second ring cyclization
    needs an appropriate cyclase (such as the actIV protein); such
      CYCs show some chain length specificity: e.g., act CYC works
      on C16 and C18 but not C20 chains; presumably CYCs for such
      longer chains remain to be discovered
    for sufficiently long unreduced chains with C-9/C-14 first
      ring, the minimal PKS (e.g., tcm) catalyzes second ring
      cyclization (C-7/C-16)

second ring aromatization
    a "monodomain" ARO, the tcmN protein, is responsible for
      aromatization of the second ring of unreduced chains, for
      which the first ring aromatizes spontaneously

further cyclizations
    the experiments performed so far with recombinant PKSs have
      not included proteins needed for natural cyclization
      reactions beyond the second ring, and often only the first;
      in these experiments, the free chain ends, when long enough,
      cyclize spontaneously by "chemical" rules of the kind
      previously indicated by biomimetic studies: 111 methyl ends
      give hemiketals and benzene rings; carboxyl ends give
      pyrones and may be decarboxylated if a free β-carboxyl
      exists; alternatively, aldol condensations may occur between
      ends, giving further carbocyclic rings

choice of starter unit
    there is little information on this; the dps minimal PKS could
      make the correct choice of propionate starter in
      S. coelicolor, but the otc minimal PKS used acetate instead
      of the presumed natural starter
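
The chain-length specificities listed for aromatases and cyclases in Table 2 lend themselves to a simple lookup when choosing subunits for a planned construct. The Python sketch below is only an illustration: the dictionary encodes the values as read from Table 2, the names `WORKS_ON`, `reported_to_work`, and `candidates` are hypothetical, and a `False` result means "not reported to work in Table 2", which is weaker than "shown not to work" (only the act CYC on C20 chains is stated explicitly as a negative).

```python
# Backbone chain lengths (in carbons) on which each subunit is reported
# to work, as read from Table 2. Labels follow the table's abbreviations.
WORKS_ON = {
    "gris ARO": {16, 18, 20},
    "fren ARO": {16, 18},
    "act ARO": {16},
    "act CYC": {16, 18},
}


def reported_to_work(subunit: str, chain_length: int) -> bool:
    """True if Table 2 lists the subunit as working on this chain length.

    Unknown subunits return False rather than raising an error.
    """
    return chain_length in WORKS_ON.get(subunit, set())


def candidates(chain_length: int) -> list:
    """Alphabetical list of subunits reported to accept this chain length."""
    return sorted(
        name for name, lengths in WORKS_ON.items() if chain_length in lengths
    )


if __name__ == "__main__":
    for length in (16, 18, 20):
        print(f"C{length}:", ", ".join(candidates(length)))
```

For a C20 chain, for example, only the gris ARO appears in the output, matching the table's remark that cyclases for such longer chains remain to be discovered.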

III. Modular Polyketide Synthases for Macrolide
Biosynthesis

As mentioned in section I, the programming
mechanism for the complex or reduced polyketides,
represented by the aglycons of the various classes of
macrolides, is quite different from that for the
aromatic polyketides. The first revelations about the
modular structure of a macrolide PKS came from
studies of erythromycin biosynthesis in
Saccharopolyspora (formerly Streptomyces) erythraea, and
this remains the type system for this class of PKSs.
In this section, I briefly review the isolation of the
genes for the erythromycin PKS and then touch on
the genetics of some other macrolide PKSs (Table 3),
before examining the evidence for and implications
of the programming model itself.

A. Cloning and Sequencing of the Erythromycin 
PKS Genes 

In 1982, Thompson et al., 120 as part of a study
aimed at the isolation of a series of Streptomyces
self-resistance genes, described the cloning from Sac.
erythraea of a DNA fragment that conferred
resistance to erythromycin (36: Figure 12) on S. lividans.
A single clone, pIJ23, was obtained, but it yielded so
little DNA that the resistance gene (later named

Table 3. Cloning of Genes for Modular PKSs from Bacteria






Host 



polyketide 
(agiycon) 



PKS 
genes 



cloning strategy 



Sac, erythraea ery^romycin (36) ery resistance, followed by 

(6-deoxyerythronolide walking and 



evidence for 
cloning of 
correct genes 0 



module and gene 
organization 6 
(nucleotide sequence 
accession numbers) 



refCs) 



S. fradiae 



B) (37) 



tylosin (38) (tylactone) 



1,2,3,4,5 [S+2]+[23+[2+TE] 



S. ambofaciens spiramycin (39) 
(platenolide) 



S. thermotolerans carbomycin 

(platenolide) 

S. antibioticus Oleandomycin (40) 
(oleandolide) 



S. avermitffis avermectins (41) 



S. cyanogriseus nemadectin 



S. sp.FR-008 FR-008 (candicidin 
agiycon) (42) 

S. hygroscopicus rapamycin (43) 



S. sp. MA 6548 FK506 (44) 



Sorar&ium Soraphen A (45) 

cellidosum 



complementation 



tyl reverse genetics for 
a tailoring step, 
followed by walking 
and complementation 

srm resistance, followed by 
walking and .' 
complementation 



car resistance, followed by 

walking and 
.. complementation 
ole ery and resistance 

gene probes 



avr complementation of 
blocked mutants, : 
followed by walking 



nem avr probes . 



paba-synthase 
followed by 
ery probes . 
rap ery probes. 



fkb reverse genetics for 
a tailoring enzyme, 
followed by walking 
and sequencing 
gra (Type H PKS) 
probe 



1,2 



1,5 



genetic order 
colinear with 
functional order 
(X56107, M63676-7) 



120,123-132 



136,137,140 



1,2 



2 

2,5 



genetic order colinear 
with functional order? 
(Not deposited) 

[S+21+W^+tlJ+Ci-hTE] 140,141 

genetic order 

colinear with functional 

order? (Not deposited?) 

142 

[?J+[2+TE] 143 
only one protein with 
two modules sequenced 
(L09654) 

2x[3]+? 144-146 
genetic order 
not colinear with 
functional order; 
two converging 
sets of modules 

[?] « ^ 146 

genetic order not 
colinear with 
functional order: 
two converging 
sets of modules 

150 



[6]+[4]-f£43 147 

genetic order not 
colinear with 
functional order 
(X86780) 

M+? 148-149 
only one gene with 
four modules sequenced 

only one module and part of a 151 
second sequenced 
(U24241) 



module, and terminating domain, le^l l?^ 81 ?' ^ ment of module number, starting 

modular content of protein sab^^^^^^^J^F^^i by chemical structure. * Square brackets sho^f 
and arrows show the direction of trans^ptiori 'of^ gne£ ' ™ <^ymg a starter module and two chain extender modules), 

ermE) had to be rescued by subcloning to yield pIJ43.
This plasmid had a difficult birth (apart from being
very nearly lost, it suffered some rearrangements of
vector sequences, and lacked the native ermE promoter),
but it turned out to have a bright future.
The plasmid was dispatched to B. Weisblum for
sequencing of the ermE gene, 122 and it, or a derivative
of it, was sent to the laboratories of C. R. Hutchinson,
P. F. Leadlay, L. Katz, and R. H. Baltz, where the
ermE gene was used as a hybridization probe to clone
genes for erythromycin biosynthesis from the genome
of Sac. erythraea. DNA fragments isolated by genomic
walking from ermE were sequenced, and used
in gene disruption and complementation experiments,
by these laboratories. In this way, numerous
genes encoding tailoring steps in erythromycin
biosynthesis were found. One lay on the 5' side of ermE
(to the left as the gene cluster is conventionally
drawn), and many on the 3' (right) side. 123-127 A key
step was the identification of a segment of DNA that
complemented the eryA class of mutations, which
were blocked in biosynthesis of the aglycon
(polyketide) moiety of erythromycin (6-deoxyerythronolide
B: 37), some 12 kb downstream of ermE; 128
even more significantly, as it turned out, another
segment of DNA that hybridized to the one that
complemented the eryA mutations was found some
35 kb downstream of ermE, implying that eryA was
a locus covering a considerable stretch of
DNA. 129 The stage was thus set for the sequencing
of eryA in the Leadlay and Katz laboratories. The
results were spectacular: the sequence revealed
three unusually large genes, each encoding a protein
carrying two modules of PKS active sites, with each
module resembling in its sequence and organization
a vertebrate fatty acid synthase; 130-132 these are the
Figure 12. The modular erythromycin PKS and the "assembly-line" model of biosynthesis leading to 6-deoxyerythronolide
B, and on to erythromycin A. 



three proteins that are now known as 6-deoxyeryth- 
ronolide B synthase (DEBS) 1, 2, and 3 (Figure 12). 
The three genes were seen to be arranged in the same 
order as the proposed sequence of action of the six 
modules of active sites that they encoded: DEBS1, 
carrying a short starter module and chain extender 
modules 1 and 2; followed by DEBS2, carrying 
modules 3 and 4; and then by DEBS3, carrying 
modules 5 and 6 together with a final thioesterase 
domain for carbon chain release. The existence of 
these three hypothetical proteins was confirmed in 
a biochemical tour-de-force in which they were re- 
vealed by Western blotting and N-terminal sequenc- 
ing. 133 

B. Genes for Other Macrolide PKSs 

1. Tylosin

Tylosin (38: Figure 13), produced by Streptomyces 
fradiae, was one of the first antibiotics for which a 



comprehensive set of blocked mutants was isolated. 
These were used to help to define the biosynthetic
pathway 134 and to adumbrate the potential for genetic 
engineering in manipulating it, either to increase the 
productivity of the fermentation or to generate novel 
antibiotics. 135 Whereas erythromycin is a 14-mem- 
bered macrolide, tylosin has a 16-membered ring 
structure. The gene (tylF) for the final biosynthetic 
step — O-methylation of the tylosin precursor, macro- 
cin — was cloned by reverse genetics from the protein 
sequence of the enzyme, and specific segments of 
surrounding DNA were found to complement eight 
other classes of blocked mutants, but not the tylG 
mutants that were presumed to be defective in the 
tylosin PKS. 136 Soon the genetic and physical map 
of the cluster was extended to cover a ~90 kb 
segment bounded by two resistance genes. 137 It 
included DNA segments that complemented nearly 
all the known classes of blocked mutants, but only 
one member of a large set of tylG mutants. This 
DNA segment was adjacent to a ~30 kb stretch of
DNA that included a series of (imperfect) direct
repeats. This segment, named an Amplifiable Unit
of DNA (AUD), was subject to repeated amplification
and deletion events, caused presumably by
recombination involving the repeats. 138,139 Some of
these deletions and amplifications had earlier been
associated with a tylG phenotype. This mysterious
situation, which not surprisingly puzzled and frustrated
the Lilly group for several years, was clarified
as soon as the modular structure of the erythromycin
PKS was announced: the 30 kb stretch of the tylosin
biosynthetic gene cluster represented a complex tylG
locus encoding a modular tylosin PKS, and recombination
between the conserved sequences in the various
modules was the likely explanation for the
observed amplification/deletion events. Why this has
not been found for the corresponding eryA locus
(or those for other cloned modular PKSs) is unknown.
Perhaps the tylG modules retain more DNA sequence
similarity than the corresponding eryA modules; or
perhaps the extent of nucleotide sequence identity
required for homologous recombination in S. fradiae
is less than in Sac. erythraea, so that the recombination
between different modules that would trigger
amplification occurs at a noticeable frequency in the
former but not in the latter. Unfortunately,
although the sequences of the tylosin PKS genes have
apparently been determined by the Lilly group
(referred to in ref 140), they are not yet in the public
domain.

2. Spiramycin and Carbomycin 

Spiramycin (39) and carbomycin, produced by 
Streptomyces ambofaciens and Streptomyces thermo- 
tolerans respectively, are two further 16-membered 
macrolide antibiotics both derived from the same 
primary polyketide metabolite, platenolide. For both, 
cloning of one or more resistance genes in a heterologous
Streptomyces host led to identification of
linked biosynthetic genes that were revealed by
complementation of blocked mutants with segments
of the cloned DNA. 141,142 Complete PKS gene sequences,
at least for the spiramycin PKS (srmG),
have been obtained by the Lilly group. The sequences
themselves do not appear to be in the public
domain but a summary of the deduced modular
structure of the srmG-encoded PKS has been published. 140
The seven modules needed to assemble the
octaketide are distributed over five PKS subunits:
two bimodular proteins (like the three DEBS proteins),
the first including a starter module, and three
unimodular proteins, the last carrying a thioesterase
release domain (Table 3).

3. Oleandomycin 

One open reading frame that would encode a 
protein carrying two modules of a PKS was cloned 
from the producer of another 14-membered macrolide 
aglycon, that of oleandomycin (40) produced by 
Streptomyces antibioticus, by the use of a DEBS3
PKS probe and an oleandomycin resistance gene
probe. 143 No definite proof of its involvement in
oleandomycin biosynthesis was obtained, but the
close linkage of a specific resistance gene is suggestive,
and the presence of a thioesterase domain at
the C-terminus of the presumptive product of the
gene is consistent with its encoding the final two
modules of the presumably six-module oleandomycin
PKS. 

4. Avermectin and Nemadectin 

A biosynthetic gene cluster for the avermectins
(41), important antiparasitic polyketide macrocyclic 
lactones produced by Streptomyces avermitilis, was 
isolated by complementing mutants blocked in tailor- 
ing steps of the biosynthetic pathway: for O-meth- 
ylation and glycosylation of the polyketide moiety. 144 
Chromosome walking between the two complement- 
ing regions, and gene disruptions, identified a 165 
kb region containing the avr gene cluster. Centrally 
located in the cluster, DNA encoding a modular PKS 
was found by hybridization and limited DNA sequencing. 145
From this it was deduced that the genes
for 12 PKS modules needed to assemble the aver- 
mectin aglycon are organized differently from those 
encoding the DEBS modules in two respects: (1) the 
genes are not colinear with the functional order of 
action of the proteins, but instead the DNA encoding 
the 12 modules forms two converging sequences, each 
apparently encoding six modules; and (2) two of the 
genes appeared to encode proteins each carrying 
three modules, instead of each encoding two modules 
as in DEBS (distribution of the remaining modules
is unknown). 146 Use of avr PKS sequences as probes 
against DNA of Streptomyces cyanogriseus, the pro- 
ducer of nemadectin, with the same aglycon structure 
as avermectin, led to the isolation of genes (nem) 
encoding another modular PKS, and this was proved 
by gene disruption to be involved in nemadectin 
biosynthesis. 146 Hopefully, sequence information on 
the avr and nem PKS genes will become available in 
due course. 

5. Rapamycin and FK506 

Rapamycin (42) is an important immunosuppres- 
sant made by Streptomyces hygroscopicus. The PKS 
genes, and most or all of the genes encoding tailoring 
steps in rapamycin biosynthesis, were cloned by the 
use of DEBS PKS probes, followed by chromosome 
walking and sequencing of 107 kb of DNA. 147 The
PKS consists of three RAPS proteins carrying the 
active sites needed for 14 condensation cycles; one 
protein consists of six modules and the two others 
contain four modules each. In contrast to the erythromycin
case, the genetic order is not the same as
the functional order of activity of the three proteins 
(Table 3), and typical starter and thioesterase do- 
mains are not seen, consistent with the different 
structures of the rapamycin and erythromycin poly- 
ketides: rapamycin starts with incorporation of a 
cyclohexane carboxylic acid unit and ends with 
incorporation of a pipecolic acid unit. Gene disruption
has been used to prove the involvement of these
genes (but not of two other DNA regions from the
same S. hygroscopicus strain, each presumptively
encoding a modular PKS) in rapamycin biosynthesis. 147

FK506 (43) is another important immunosuppres- 
sant, made by Streptomyces sp. MA 6548. Reverse 
genetics led to the cloning of a gene for a tailoring 
enzyme, an O-methyltransferase, which was shown
by gene disruption to be essential for FK506 biosynthesis. 148
Nearby, sequencing revealed a gene for a
cytochrome P450 hydroxylase, responsible for another
biosynthetic step, 148 and a large open reading
frame which would encode a modular PKS subunit
carrying four modules of active sites, and closely
resembling one of the RAPS proteins. 149 

6. Candicidin 

Candicidin (44) is a member of the polyene mac- 
rolide class of complex antifungal polyketides. A 
related compound, FR-008, with an identical aglycon 
to that of candicidin, is made by Streptomyces sp. FR- 
008. A gene cluster involved in FR-008 biosynthesis 
was isolated by hybridization, initially using as probe 
a gene involved in biosynthesis of the p-aminobenzoic
acid-derived starter unit; later the DEBS2 gene was
used as a PKS probe, and finally probes specific for
ACP, KS, and AT motifs from the DEBS PKS. 150 The
hybridization patterns with these last probes re- 
vealed that DNA encoding a modular PKS extended 
over 105 kb. This would be an appropriate length to 
encode 21 PKS modules of average length ~5 kb, and
this is the number of condensations required for 
synthesis of the FR-008 aglycon. The finding is 
significant in implying a one-to-one relationship 
between modules and rounds of condensation, even
in a polyene with a high degree of chemical repetition 
represented by the seven successive double bonds all 
associated with the insertion of acetate residues. A 
priori, this part of the polyketide might have been 
assembled by the iterative activity of a single module, 
but this result makes this idea unlikely. 
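
The length argument above is simple arithmetic, and a short sketch makes it explicit. The function name and the rounding rule are illustrative assumptions; only the 105 kb span, the ~5 kb average module length, and the 21 condensations come from the text.

```python
def estimate_module_count(pks_dna_kb: float, avg_module_kb: float = 5.0) -> int:
    """Estimate how many PKS modules a stretch of DNA could encode.

    Hypothetical helper: it simply divides the hybridizing DNA length by an
    assumed average module length and rounds to the nearest whole module.
    """
    if pks_dna_kb <= 0 or avg_module_kb <= 0:
        raise ValueError("lengths must be positive")
    return round(pks_dna_kb / avg_module_kb)


# Figures quoted in the text for the FR-008 cluster.
fr008_modules = estimate_module_count(105)   # 105 kb at ~5 kb per module
condensations_needed = 21                    # condensations for the FR-008 aglycon

# A one-to-one module/condensation relationship predicts equality.
print(fr008_modules, fr008_modules == condensations_needed)
```

The estimate is only as good as the assumed average module length; a module carrying a full reductive loop is longer than one carrying none, so the agreement is suggestive rather than a proof.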

7. Soraphen 

Soraphen A (45), produced by a myxobacterium, 
Sorangium cellulosum, is made by the first example 
of a functional modular PKS so far known outside of 
the actinomycetes. Interestingly, it was possible to 
clone DNA encoding part of the PKS by the use of a 
probe encoding part of one of the type II, nonmodular
PKSs, for granaticin biosynthesis (see section II). 151
Gene disruption proved the involvement of the cloned
DNA in soraphen biosynthesis, and sequencing has 
so far revealed part of a gene encoding one complete 
module of PKS active sites and an incomplete second 
module. 

C. The Programming Model and Its Proof by 
Mutant and Recombinant Construction 

Initial evidence for the assembly-line model for 
programming of the erythromycin PKS (Figure 12) 
was provided by the sequence itself. 131 Not only did 
the six modules of putative catalytic sites correspond
in number to the six condensations needed to build 
the erythromycin heptaketide, but special features 
of specific modules could be related to their proposed 
functions: DEBS1 had extra N-terminal AT and ACP 
domains, before module 1, which would function in 
the loading of the propionyl CoA starter unit (only 
later was this part of the protein designated as a 
separate "starter" or "loading" module), and DEBS3
was unique in carrying a putative thioesterase do- 
main after module 6 for hydrolysis of the final 
thioester bond between the completed polyketide 
chain and the 4'-phosphopantetheine prosthetic group 
of the last ACP domain to release the carbon chain;
module 3 lacked all three reductive functions (KR, 
DH, and ER), agreeing with the presence of an 
unreduced keto group after the third condensation, 
while module 4 was unique in carrying candidate 
domains for all three such functions, as expected in 
view of the reduction of the keto group right through 
to a methylene after the fourth condensation. Experimental
evidence was soon provided by two successful
domain inactivation experiments: when the
active site of the putative KR in module 5 was 
deleted, a polyketide with an unreduced keto group 
after the fifth condensation (46: Figure 14) was 
isolated; 131 and when the putative ER in module 4 
was mutated, a double bond appeared in the final 
product (47), as expected from a failure of enoyl
reduction after the fourth condensation. 152 
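
The logic of the reductive cycle, and of the two inactivation experiments, can be sketched as a lookup. This is a minimal illustration, not a model of the enzyme: the text states only that module 3 lacks all three reductive functions, that module 4 carries all three, and that module 5 carries a KR; the KR-only content assumed here for modules 1, 2, and 6 is the commonly described arrangement and is not given in this section. All names are hypothetical.

```python
def beta_carbon_state(reductive_domains):
    """Functionality left at the beta-carbon after one condensation cycle.

    The reductive cycle runs KR -> DH -> ER, so the first missing domain
    decides where reduction stops.
    """
    present = set(reductive_domains)
    if "KR" not in present:
        return "keto"
    if "DH" not in present:
        return "hydroxyl"
    if "ER" not in present:
        return "enoyl"
    return "methylene"


# Reductive domains per DEBS module (modules 1, 2, 6: assumption, see lead-in).
DEBS_MODULES = {
    1: ("KR",),
    2: ("KR",),
    3: (),
    4: ("KR", "DH", "ER"),
    5: ("KR",),
    6: ("KR",),
}


def predicted_states(modules):
    """Map each module number to the predicted beta-carbon functionality."""
    return {n: beta_carbon_state(d) for n, d in sorted(modules.items())}


def inactivate(modules, module_number, domain):
    """Return a copy of the module table with one domain removed."""
    edited = dict(modules)
    edited[module_number] = tuple(d for d in modules[module_number] if d != domain)
    return edited


wild_type = predicted_states(DEBS_MODULES)
kr5_mutant = predicted_states(inactivate(DEBS_MODULES, 5, "KR"))
er4_mutant = predicted_states(inactivate(DEBS_MODULES, 4, "ER"))

print(wild_type[3], wild_type[4])     # module 3 and module 4 in the wild type
print(kr5_mutant[5], er4_mutant[4])   # the two inactivation experiments
```

Under these assumptions the KR deletion in module 5 predicts an unreduced keto group after the fifth condensation, and the ER mutation in module 4 predicts a double bond after the fourth, which is what the two experiments reported.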

These results were indeed impressive in establishing
the assembly line model. Further strong support
for the model was provided by the construction of
recombinants carrying reduced numbers of DEBS
modules, which were found to produce various
"minilactones" of appropriate structure. One set of
experiments (here referred to as the "Stanford experiments",
but involving a collaboration of the
Khosla with the Katz and Cane laboratories) stemmed
from the remarkable feat of engineering expression 
of the entire set of DEBS proteins in the host-vector 
system (S. coelicolor CH999/pRM5) that had already 
proved so successful for analyzing type II PKS 
structure-function relations (section II): the result- 
ing recombinant produced significant quantities of 
6-deoxyerythronolide B, together with an analogue 
with an acetate starter (48: Figure 14). 153 When the 
DEBS1 protein was expressed alone, a triketide 
lactone (49) arose, 154 which was precisely the expected
product of modules 1 and 2 (and had earlier
been reported as a minor product in the mutant
deleted for the KR of module 5 155 ); it increased in
quantity when the thioesterase domain, which normally
forms the C-terminus of DEBS3, was added to
DEBS1. 156 This construct (DEBS1+TE) was the one
that had already been engineered in exchange for the 
normal DEBS1 in a Sac. erythraea host deleted for 
DEBS2 and most of DEBS3 in the "Cambridge 
experiments", which also yielded the triketide lactone 
(49). 157 (In both the Stanford and the Cambridge 
experiments, recombinants carrying DEBS1+TE in
the S. coelicolor surrogate host also produced an 
unnatural triketide lactone (50) with an acetate 
instead of a propionate starter. 156 - 158 ) The production 
of tetraketide (51 and 52) and hexaketide (53) lac- 
tones by S. coelicolor recombinants carrying modules 
1-3 or 1-5 (together with the TE) provided further 
evidence for the programming model, 156 - 159 and for the 
remarkable degree of functional independence of 
which the various modules are capable. 
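
The truncation results follow a simple counting rule: one starter unit plus one extender unit per module. The sketch below restates that rule; the function and dictionary names are illustrative, and only the module ranges and product classes come from the text.

```python
# Names for chains built from a given number of acyl units.
KETIDE_NAMES = {
    3: "triketide",
    4: "tetraketide",
    5: "pentaketide",
    6: "hexaketide",
    7: "heptaketide",
}


def lactone_class(extender_modules: int) -> str:
    """Chain class made by a starter unit plus `extender_modules` extensions."""
    acyl_units = 1 + extender_modules   # one starter, one unit per module
    return KETIDE_NAMES[acyl_units]


# Module sets described in the text, with the full six-module DEBS for reference.
for last_module in (2, 3, 5, 6):
    print(f"modules 1-{last_module}: {lactone_class(last_module)}")
```

The rule reproduces the triketide, tetraketide, and hexaketide lactones reported for modules 1-2, 1-3, and 1-5, and the heptaketide made by the complete PKS.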

Most recently, success has been achieved in at- 
tempts to generate unnatural natural products by 
domain replacement rather than deletion or mu- 
tagenesis. In one of these, advantage was taken of 
the key finding, from the sequencing of the rapamy- 
cin PKS genes, of a difference in the consensus 
sequence for the seven AT domains that would 
introduce acetate extender units (from malonyl CoA) 
from that of the seven domains that would introduce 
propionate extenders (from methylmalonyl CoA) 157 
(the latter consensus was also shared by the AT 
domains in DEBS, which handle methylmalonate).
Again exploiting the CH999/pRM5 expression system,
the methylmalonyl transferase domain of DEBS1
module 1 was replaced by the malonyl transferase
domain of RAPS1 module 2, to generate a functional 
PKS that produced novel triketide lactones (54 and 
55) of the predicted structure, with incorporation of 
an acetate instead of a propionate residue as the first 
chain-extender unit. 160 In a second example of 
domain swapping, the starter module of the spiramycin
aglycon PKS, which makes platenolide in S.
ambofaciens by incorporating an acetyl unit, was
replaced by the starter module for the S. fradiae
tylactone PKS, which normally incorporates a propionyl
starter: the result was the predicted methyl
platenolide. 140 Another example of starter module
swapping involved exchange of the starter module of 
DEBS1+TE (cloned in S. coelicolor) by the starter 
domain of the avermectin PKS; the recombinant 
made the predicted triketide lactones that start with 
the branched-chain residues characteristic of the 
natural avermectins (41: Figure 13). 161 Finally, a
very important development was the addition of a
dehydratase domain (from the rapamycin PKS) to
DEBS1 module 2, to yield a product with a "gain of 
function" double bond in the expected position. 161a 

D. In Vitro Studies and Model Building 

Genetic approaches have been crucial in deducing 
the basic features of the programming rules for the 
modular type I PKSs, but understanding how these 
rules are interpreted by the cellular machinery will
have an increasingly large biochemical input. Some 
very significant steps have already been taken. For 



example, in a beautiful in vitro experiment with 
purified DEBS1, DEBS2, and DEBS3 proteins, it was 
found that only the 2S, and not the 2R, stereoisomer
of 14 C-labeled methylmalonyl CoA was attached to 
all the AT sites of the PKS subunits, providing 
evidence for the use of only the 2S stereoisomer in 
chain extension (presumably with racemization after 
incorporation of the second, fifth, and sixth extension 
units to give the R configuration found in 6-deoxy- 
erythronolide B), rather than selective use of (2S)- 
and ( 2R )-methylmalonyl CoA as chain extenders. 162 
Both the Stanford and the Cambridge groups have
taken the important step of developing active in vitro
systems using recombinant DEBS proteins: these
use either the natural, three-component
PKS 163 or the DEBS1+TE engineered protein.
Interestingly, the recombinant enzymes in vitro
showed considerable permissiveness for starter units,
extending beyond just propionyl CoA or acetyl CoA.
Biochemical and genetic studies have also begun
to define important features of the three-dimensional
structure of the native DEBS protein. A very interesting
model for this, proposed by the Cambridge
group largely from the results of rigorous chemical
cross-linking experiments, is a double-helical homodimer
in which the domains found in all of the
modules (KS, AT, and ACP) form the core of the helix
and the reductive domains form loops of various
sizes, depending on the presence of KR, KR+DH,
or KR+DH+ER domains in a specific module. 167 An
attractive feature of the model is its overall head-to-head/tail-to-tail
organization, which would nicely
accommodate the additions and/or deletions of modules
that are a logical consequence of the presumed
evolutionary relatedness of the various modular
PKSs (section VI). A model proposed by the Stanford
group, from mutant complementation experiments,
also has an overall head-to-head/tail-to-tail dimeric
structure, but within each module the two identical
polypeptide chains would lie head-to-tail. 168 This
arrangement, in which there would be two equivalent
clusters of active sites in each module, with each
protein subunit contributing certain sites to each
cluster (for example the KS to one and the ACP to
the other), not only arises elegantly from the complementation
results, but is reminiscent of the head-to-tail
homodimeric structure of the ("unimodular")
vertebrate fatty acid synthase, 169,170 to which the PKS
modules are doubtless phylogenetically related (see
section VI). Especially in view of the fact that the
helical aspect of the Cambridge model is arbitrary, 167
the two models are in fact essentially equivalent
topologically.

IV. 6-Methylsalicylic Acid Synthase and Other 
Fungal Polyketide Synthases 

6-Methylsalicylic acid synthase (6-MSAS) is the
classical PKS which has been studied from a biochemical
standpoint ever since synthesis of 6-MSA
(56: Figure 15) from acetyl CoA and malonyl CoA
in a cell-free extract of Penicillium patulum was
[...]. The gene was cloned by screening an expression
library with antibodies raised
against the purified protein. Sequencing revealed
a single open reading frame (interrupted by a short
intron). The four active
sites [...] of the encoded protein (KS,
AT, KR, and ACP) resembled the corresponding sites
in rat FAS and were colinear in the two synthases.
This provided early evidence for a presumptive
relationship between eukaryotic PKS and FAS
genes, and detailed biochemical similarities tended
to support this idea. Apart from furnishing crucial
information on the primary structure of the enzyme,
genetic contributions to understanding 6-MSAS function
are so far few. They should be aided by the
expression of the gene in the S. coelicolor
CH999 host, in which it gave rise to significant
quantities of 6-MSA. 174

Genes for several other type I PKSs from filamentous
fungi have subsequently been cloned and sequenced: [...]
pigment of unknown structure in Aspergillus nidulans;
the decaketide norsolorinic acid (57), an
intermediate of aflatoxin biosynthesis in Aspergillus
parasiticus; the pentaketide precursor (1,3,6,8-
tetrahydroxynaphthalene: 58) of fungal melanins in
Alternaria alternata and Colletotrichum lagenarium;
and the nonaketide mevinolin (lovastatin) (59)
[...]. 181 The deduced products of
these PKS genes, like 6-MSAS, all show an overall
structural resemblance to the vertebrate FAS, with
absences of particular domains. Thus
[...] a C-terminal thioesterase
domain (presumably release of the carbon chain from
the synthase occurs by a mechanism different from
hydrolysis of the final thioester
bond) and all except the mevinolin PKS lack the
domains for the three reductive functions, which are
superfluous for the biosynthesis of unreduced polyketides.

Other features of interest have been noted in
particular cases: for example a methyltransferase
[...] forms an integral
part of the structure of the mevinolin PKS. A feature
of great genetic interest is the finding of close linkage
of the genes for various steps in the biosynthesis
of these fungal metabolites; the most striking example
is a cluster of 25 coregulated genes for biosynthesis
of the aflatoxin-related metabolite sterigmatocystin
(elaborated from norsolorinic acid) in A. nidulans.
Such clustering, which is characteristic
of both primary and secondary metabolites in bacteria,
[...] for primary metabolic
[...].

In a very recent turn of events, the aflatoxin
[...] of A. parasiticus and
A. nidulans have been found to encode two polypeptide
chains typical of a fungal FAS, in which the
catalytic domains occur in a different order from
those of the vertebrate FAS (Figure 2). 183-185 This
presumed FAS is proposed to be dedicated to synthesis
of a hexanoyl residue which, rather than
acetate, would prime norsolorinic acid biosynthesis,
and is distinct from the FAS used to make the lipids
of the cell. 185 Perhaps this can be claimed as another
example of genetic analysis helping to solve a problem
in bioorganic chemistry (revealing that norsolorinic
acid would be an octaketide rather than a
decaketide 186 ) to put with those arising from the
manipulation of type II aromatic PKS genes (section
II.F.6).

V. The Chalcone and Stilbene Synthase 
Superfamily from Higher Plants

Chalcone synthase (CHS) is a PKS by definition
because it catalyzes the linking of acyl CoA subunits
by repetitive decarboxylative condensations that is
the hallmark of polyketide and fatty acid synthesis.
However, there are significant differences in the
biochemistry of the process, notably the participation
of free CoA esters as substrates without the involvement
of 4'-phosphopantetheine arms carried on acyl
carrier proteins; and sequence comparisons suggest
that CHS is phylogenetically distinct from all other
groups of PKSs and all known FASs, with no significant
overall resemblance of amino acid sequence, and
a different motif surrounding the active-site cysteine. 187
CHS is a remarkable enzyme because, as a
homodimer of a modest-sized polypeptide chain (only
43 kDa), it selects coumaroyl CoA as starter, carries
out three successive extensions using malonyl CoA
(with no reduction of β-keto groups), and releases the
resulting tetraketide to cyclize to naringenin chalcone
(60: Figure 16), the precursor for the anthocyanin
pigments and other flavonoids of plants.

The CHS gene from parsley (Petroselinum hortense)
was the first to be cloned, as a cDNA, by taking
advantage of the abundance of its transcript in
cultured cells. 188 It can therefore be regarded as the
founding member of what has now grown to be a
large superfamily of genes for these specialized PKSs
found, so far exclusively, in higher plants. The family
includes not only CHSs from many Angiosperm
families and from Gymnosperms, but also related
enzymes, the stilbene synthases (STSs), which share 
more than 65% amino acid sequence identity with 
CHSs, and build the same nascent tetraketide, but 
release it with decarboxylation and a different fold 
compared with chalcone to generate stilbene (61) 
(Figure 16). 189 Very interestingly from a genetic 
standpoint, a phylogenetic analysis of more than 30 
CHS and several STS sequences showed a greater 
resemblance between CHS and STS sequences from 
related plants than among CHSs or STSs, suggesting 
that STSs have evolved from CHSs more than once; 
indeed it was possible to reproduce such changes in 
specificity by site-directed mutagenesis. 190 Apart 
from typical CHSs and STSs, the superfamily in- 
cludes enzymes that generate other types of tetra- 
ketides that differ in the choice of starter unit, such 
as the bitter acids of hops (Humulus lupulus), which
start with isovaleryl CoA or isobutyryl CoA. 191
Figure 16. Reactions catalyzed by plant chalcone synthases
(CHS) and stilbene synthases (STS) and structures
of their "classical" products, naringenin chalcone (60) and
stilbene (61). Note that both synthases use as starter a CoA
ester from the phenylpropanoid pathway (such as coumaroyl
CoA, where R1 = OH and R2 = H) and three
malonyl CoA extenders to generate a linear tetraketide,
which folds differently to connect different carbon atoms
(solid or open arrows) to produce either 60 or 61 (the latter
with decarboxylation). (Reproduced with permission from
ref 187a. Copyright Elsevier Science Ltd., The Boulevard,
Langford Lane, Kidlington, Oxford OX5 1GB, UK. Figure
kindly supplied by J. Schröder.)

Genetic manipulation has played a significant role
in analysing CHS and STS function. Apart from site-directed
mutagenesis to confirm the active-site cysteine
and to interconvert the two classes of enzyme,
just mentioned, elegant in vitro mutation
analysis has suggested that although the
native enzymes are homodimers, each monomer can
perform all three condensation reactions, but complementation
between two inactive monomers to form
an active enzyme suggests that the two monomers
cooperate in other ways to generate product; 192 this
might contrast with the situation for the vertebrate
FAS (and perhaps the DEBS modules, section III.D),
in which two active centers on the homodimer arise
by obligate cooperation between complementary sites
on the two subunits. 169,170 Much work has also been
carried out on the transcriptional control of CHS (see 
the review by Martin 193 for an account of these 
studies and of the various biological roles of the plant 
metabolites produced by the CHS). 
Table 4. Possible Genetic Events in the Evolution of FASs and PKSs

genetic events a: consequences; selective pressure; possible examples

mutation and recombination
    [...]

duplication
    origin of FASs from PKSs (or vice versa), and of diverse PKSs from each other; to increase chemical versatility
    origin of multiple modules, giving rise to "assembly line" programming; to increase chemical complexity of polyketides
    origin of CLF from [...] KS (Figure 9), perhaps generating heterodimeric condensing enzyme from homodimeric progenitor; to better control chain length?
    origin of didomain aromatase from monodomain progenitor (Figure 10); to handle reduced carbocyclic rings more efficiently?
    origin of extra FASs (or some subunits thereof); to provide specialized secondary metabolites; e.g., aflatoxins, nod factors

deletion
    deletion of whole modules to shorten macrolide chains; to diversify product structure; e.g., hygrolidin vs concanamycin, or methymycin vs erythromycin 205
    deletion of one, two, or three reductive domains (ER, DH, KR) from specific modules; to diversify functionality of macrolide chain

fusion
    origin of type I from type II synthases (at least two fusion cycles needed to give rise to vertebrate and fungal synthases); to increase catalytic efficiency [...] equimolar synthesis of catalytic sites in the absence [...]
    origin of multimodular proteins from unimodular progenitors, converting noncovalent to covalent joints in modular PKS; to increase fidelity of building the "assembly line"?

recruitment
    of reductive cycle enzymes (KR, DH, ER), converting a primordial PKS into a FAS; to generate lipids for membrane biogenesis
    of cyclases and aromatases; to aid controlled ring formation in aromatic polyketides
    of loading functions; to diversify structure of starter units
    of [...] after assembly; to diversify modifications of nascent polyketide chains
    of group transfers occurring during chain building; to diversify chemical structure; e.g., methylation in mevinolin

horizontal transfer
    acquisition of genes from a donor other than by inheritance from an ancestor; to accelerate adaptation; e.g., fungal PKS or FAS, or type I modules between bacteria (mycobacteria vs actinomycetes?)

a Sometimes the event could be the converse of the one illustrated: e.g., addition vs deletion of modules or domains.



VI. Phylogeny, Horizontal Gene Transfer, and
Developmental Crosstalk: Some Speculations

A. Phylogenetic Relationships between
Synthases

A bonus from the considerable recent interest in 
the primary structures of the proteins that make up 
the FASs and PKSs of a wide range of organisms is 
the creation of a large database of sequences that is 
potentially available for phylogenetic analysis. Al- 
ready this has given rise to some interesting evolu- 
tionary speculations. 

As discussed in section V, both sequence and 
biochemical data indicate that plant chalcone and 
stilbene synthases very likely represent a family of 
related enzymes with a separate origin from all the 
other enzymes that build carbon chains by catalyzing 
repeated decarboxylative condensations between small 
carboxylic acid residues. However, sequence com- 
parisons suggest that all these other PKSs and FASs 
could well have had a common origin. It is possible 
to construct a relatedness tree that includes all the 
condensing enzyme subunits of the type II synthases,
as well as the homologous domains of the type I
enzymes; 194,195 the same can be done with the ACPs. 196
However, there are obstacles to carrying out a proper 
phylogenetic analysis of the data. One problem is 
the wide diversity of sequences; on the hypothesis of 
a common origin, this would reflect the vast evolu- 
tionary time scale over which sequence divergence 
has occurred. The result is that any tree that
includes all the sequences is likely to be rather
arbitrary in the relationships between its deeper 
branches. Another is the inevitably subjective choice 
of the boundaries between the functional domains of 
the type I enzymes. There may well be a functional 
need for "spacer" regions between the various catalytic
sites, to allow them to assume the correct
three-dimensional relationships with other sites in the
same polypeptide chain or with partner chains in a
dimeric or multimeric structure; in these regions the 
constraints are therefore likely to be geometric rather 
than involving specific amino acid residues, so the 
degree of conservation of sequence is likely to differ 
for domain and interdomain regions. 132 Depending 
on where the sequences are "cut" for inclusion in the 
analysis, a different tree will therefore result; and 
differences in detail between trees generated with, 
for example, KS or ACP sequences may at least in 
part reflect these factors. Thus a general question
that arises in all phylogenetic sequence analysis,
namely to what extent amino acid sequence similarities
(or differences) reflect functional constraints rather
than evolutionary relatedness (or divergence), is
particularly difficult to deal with.
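
The dependence of the result on where the sequences are "cut" can be illustrated with a minimal sketch. The three toy sequences, their names, and the column ranges below are invented for illustration and are not taken from any real synthase; the distance used is the simple fraction of differing aligned positions, which is an assumption of this sketch rather than the method used in the cited analyses.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_pair(seqs: dict[str, str], start: int, end: int) -> tuple[str, str]:
    """Return the pair of names with the smallest p-distance within columns [start, end)."""
    return min(
        combinations(sorted(seqs), 2),
        key=lambda pair: p_distance(seqs[pair[0]][start:end], seqs[pair[1]][start:end]),
    )


# Hypothetical toy alignment: columns 0-9 stand for a conserved catalytic
# domain, columns 10-19 for a loosely constrained interdomain "spacer".
TOY = {
    "synA": "MKVAGCSTDL" + "PPPPPAAAAA",
    "synB": "MKVAGCSTDL" + "GGGGGSSSSS",
    "synC": "MRVSGCATDI" + "PPPPPAAAAA",
}

domain_only = closest_pair(TOY, 0, 10)   # cut at the domain boundary
with_spacer = closest_pair(TOY, 0, 20)   # cut to include the spacer
print(domain_only, with_spacer)
```

With the domain columns alone, synA and synB are the closest pair; once the spacer columns are included, synA groups with synC instead, so the apparent relationship changes with the chosen boundary.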

In spite of such problems, which hopefully will not 
permanently stand in the way of a rigorous phylo- 
genetic analysis, it is interesting to speculate a little 
on the genetic processes that would have occurred 
on the hypothesis of a common origin for the PKSs 
and FASs (Table 4) and to point out some of the 
anecdotal evidence for them. Meanwhile, it is rel- 
evant to note an observation, which could of course 
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be overturned at any moment by further research, 
that fatty acids and polyketides are apparently 
absent from the Archaea (their membranes contain 
isoprenoid ethers instead of the fatty acid-containing 
phospholipids that are found in the other two branches 
of the living world). The recent publication of the 
complete genome sequence of an Archaeon agrees 
with the failure to detect polyketide or fatty acid 
metabolites in members of the Archaea: no recognizable
KS is seen in the sequence of the Methanococcus
jannaschii genome. 197 If this absence of PKSs and 
FASs from the Archaea holds up, perhaps they lost 
the potential for this aspect of metabolism after they 
became separate from the Bacteria and the Eucarya, 
or perhaps the last common ancestor of the three 
kingdoms had not yet evolved it, but did so after the 
Archaeal branch separated and before the Bacteria 
and Eucarya diverged. 

On the hypothesis of a single common origin for
typical FASs and PKSs, an early ancestor of present-day
bacteria and eukaryotes might have evolved a
primitive condensing enzyme that recruited other
functions to become more efficient. Addition of an acyl
carrier protein and acyl transferases could have given
rise to a rudimentary PKS, perhaps followed by
recruitment of the reductive cycle to convert it to a
FAS. The resulting primordial multifunctional synthase
would have become further improved and
diversified by subsequent mutation, recombination
between diverged gene sequences, and gene duplication,
opening the way for PKSs to evolve the ability
to generate chemically distinct products, while the
(by then) essential function of fatty acid biosynthesis
could be retained by the organisms' FAS.

A type II structure is likely to have been primor- 
dial, with subsequent gene fusion generating the type 
I synthases. The finding that certain domains from
the vertebrate type I FAS, separated from the
multifunctional protein by proteolytic cleavage or genetic
engineering, can function biochemically 198-200 is
consistent with (although does not prove) an origin
of the type I enzymes by domain fusion. Until the
discovery of the type I bacterial synthases (the
modular PKSs and the nonmodular FASs of
Brevibacterium 13 and related bacteria 201), the simplest
hypothesis was that gene fusion accompanied the
transition to eukaryotic cellular architecture. This
perhaps reflected the need for coordinate, equimolar
synthesis of catalytic sites in eukaryotes in the
absence of the operon organization that, in bacteria,
allows cotranscription and cotranslation of groups of 
adjacent genes. In this respect, the FASs and PKSs 
are not unusual: there are other examples of similar 
biochemistry being performed in bacteria by sets of 
discrete enzymes and in eukaryotes by single mul- 
tifunctional enzymes. 202 Now it is a matter of more 
conjecture whether the proposed domain fusions 
actually occurred in a prokaryote and were inherited
by a eukaryotic descendant; or in a eukaryote, after 
which they were transferred back to the ancestors 
of those present day bacteria that use type I syn- 
thases. 

In any event, a minimum of two ancestral fusion
cycles is needed to explain the fact that the type I
synthases appear to form two families, based on
domain order and nucleotide sequences: (1) the
vertebrate FASs, the fungal PKSs, and each module
of the bacterial modular PKSs; and (2) the fungal
FASs and bacterial type I FASs (so far identified in
Brevibacterium and Mycobacterium). The
resemblance in FAS architecture between Brevibacterium
and the fungi has recently been underlined
by the unexpected finding of a second type I FAS in
Brevibacterium ammoniagenes, which not only resembles
the fungal FASs in domain order, as does
the first such enzyme to be discovered, 13 but even in 
carrying the site of a member of the newly discovered 
phosphopantetheinyltransferase superfamily to add 
the prosthetic group to the acyl carrier domain 203 in 
its C-terminus. 204 Resemblances between genes such 
as these, which appear to be at variance with 
taxonomic relationships, are highly suggestive of 
horizontal gene transfer. This term describes the 
hypothesis that genes may be acquired by one organ- 
ism from another, not closely related to it, by a 
process other than their normal inheritance from an 
ancestor. Horizontal transfer is difficult to prove, but 
could be supported if differences emerged between 
phylogenetic trees based on FAS/PKS sequences and 
trees constructed from genes such as those for 
primary metabolic pathways. 
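
The kind of tree comparison suggested here can be sketched as follows. This is only an illustration under stated assumptions: the two topologies and the taxon names are hypothetical, trees are written as nested tuples, and the comparison is a plain count of clades found in one rooted tree but not the other (the idea behind the Robinson-Foulds distance). Incongruence of this kind would be suggestive of horizontal transfer, not proof of it.

```python
def clades(tree) -> set[frozenset[str]]:
    """Collect the leaf set under every internal node of a nested-tuple tree."""
    found: set[frozenset[str]] = set()

    def walk(node) -> frozenset[str]:
        if isinstance(node, str):  # a leaf is just a taxon name
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def conflicting_clades(tree_a, tree_b) -> set[frozenset[str]]:
    """Clades present in exactly one of the two rooted trees."""
    return clades(tree_a) ^ clades(tree_b)


# Hypothetical topologies, invented for illustration only.
# Tree from primary-metabolism genes: bacteria and fungi group separately.
reference_tree = (("bacterium_1", "bacterium_2"), ("fungus_1", "fungus_2"))
# Tree from synthase sequences: one bacterial gene groups with a fungal one.
synthase_tree = (("bacterium_1", "fungus_1"), ("bacterium_2", "fungus_2"))

for clade in sorted(conflicting_clades(reference_tree, synthase_tree), key=sorted):
    print(sorted(clade))
```

An empty result would mean the two gene trees agree on every grouping; here all four two-taxon clades conflict.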

We are on firmer ground in attributing the multimodular
structure of the bacterial macrolide PKSs
to repeated rounds of gene duplication, presumably
within the bacteria themselves since there is (so far)
no eukaryotic example. Support for this idea comes
from the high degree of sequence resemblance at the
DNA level between some of the modules. The current
overall head-to-head/tail-to-tail models for the
three-dimensional structure of the modular PKSs (section
III.D) would accommodate very well a process of
module addition, sometimes with gene fusion to
generate covalent bonds between adjacent modules,
and sometimes without, thus leaving a number of
"noncovalent joints" in the assembly line. Synthases
for increasingly complex polyketides could have
evolved in this way. Subsequent exchange of modules
could have contributed to the origin of a multitude
of differently programmed PKSs from a limited
number of components. As pointed out by Schwecke
et al., 147 the 14-module rapamycin PKS and the
(presumed) 10-module FK506 PKS share an almost
identically programmed four-module subunit (RapC
and FkbA, respectively), which would catalyze the last
four condensations needed to build the rapamycin
(42) and FK506 (43) polyketides (Figure 13); and
Motamedi et al. 149 found that the sequences of some
corresponding domains are more similar between
RapC and FkbA than among modules in FkbA itself,
suggesting that the two synthases may have acquired
this subunit by a recent (horizontal) genetic exchange.
Again, because the codon usage of the
putative oleandomycin PKS gene of S. antibioticus
is atypical for Streptomyces, it was suggested that
this was a relatively recent acquisition. A mosaic
genetic architecture that evolved by the accretion,
subtraction, and exchange of modules in various
combinations could well underlie the striking observation
that the large number of naturally occurring
macrocyclic lactones and polyethers actually share

Table 5. Structures of Representative Polyketide and Fatty Acid Synthases

structural class of synthase; programming strategy (iterative or modular)

type I, iterative
  vertebrate FAS: one subunit
  fungal PKS: one subunit
  fungal FAS: two subunits
  Brevibacterium FAS: one subunit (a)

type I, modular
  erythromycin PKS: six modules, three subunits
  rapamycin PKS: 14 modules, three subunits
  spiramycin PKS: seven modules, five subunits

type II, iterative
  E. coli FAS: seven subunits
  actinorhodin PKS: six subunits (b)

type II, modular
  none known

(a) ... associate noncovalently. (b) Depends on ...

a considerable degree of structural regularity, some- 
times differing by a single chain-extender unit. 205 
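
The codon-usage argument mentioned above for the oleandomycin PKS gene can be made concrete with a rough sketch. Everything here is an assumption for illustration: the statistic chosen (GC3, the fraction of codons ending in G or C, which is characteristically high in Streptomyces genes), the tolerance value, and the three short reading frames, which are invented and not real gene sequences.

```python
def gc3(orf: str) -> float:
    """Fraction of codons whose third base is G or C."""
    orf = orf.upper()
    if not orf or len(orf) % 3 != 0:
        raise ValueError("ORF must be non-empty with a length that is a multiple of three")
    third_bases = orf[2::3]
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def looks_atypical(orf: str, host_orfs: list[str], tolerance: float = 0.15) -> bool:
    """Flag an ORF whose GC3 differs from the host mean by more than `tolerance`.

    The tolerance is an arbitrary illustrative threshold, not a published cutoff.
    """
    host_mean = sum(gc3(h) for h in host_orfs) / len(host_orfs)
    return abs(gc3(orf) - host_mean) > tolerance


# Invented toy reading frames, for illustration only.
host_orfs = ["ATGGCCGACCTGGTCGAG", "ATGACCGGCAAGCTCGAA"]  # G/C-rich third positions
candidate = "ATGGCAGATTTAGTTGAA"                          # A/T-rich third positions

print(gc3(candidate), looks_atypical(candidate, host_orfs))
```

A real analysis would compare full codon frequency tables over many genes; a single summary statistic on short sequences is shown only to make the reasoning explicit.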

B. Iterative vs Modular Programming 

The most significant distinction between different
types of organization among the FASs and PKSs is
not whether the various catalytic sites that make up
the toolkit of the synthase are covalently associated,
as in the type I enzymes, or only noncovalently, as
in the type II systems. In terms of programming
mechanisms, the most relevant distinction is between
synthases which carry a single set of sites that act
iteratively in successive rounds of chain assembly
and reduction, and those with a module of sites for
each round, with each site acting just once in the
building of an entire carbon chain. Table 5 exemplifies
these distinctions. It reminds us that within
the type I systems, the complete synthase may
consist of a series of different protein subunits (up
to five among known examples) that presumably dock
together to build the multifunctional enzyme.
Presumably this docking must occur quite specifically
in the case of the modular PKSs in order to establish
the complete "assembly line", and the models of
modular PKS structure allow for such head-to-tail
docking. It is not obvious why a modular synthase
should consist of any specific number of subunits
(why, for example, the erythromycin and rapamycin
PKSs are each built from three and the spiramycin
PKS from five components); and after all, seven
monofunctional proteins can associate productively
in the E. coli FAS. One could therefore imagine the
existence of a type II modular PKS in which appropriate
docking of domains, within as well as
between modules, allowed it to function correctly, but
no example has been discovered so far. A covalent
structure for the modules is doubtless more efficient.
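
The iterative/modular distinction drawn above can be restated as a small data model. This is a conceptual sketch only: the class and function names are invented, the three-module line is hypothetical, and real synthases are of course not programmed by lists. The point it tries to capture is that an iterative synthase needs chain length as separate information, whereas in a modular synthase chain length follows from the number of modules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSet:
    """One set of catalytic sites: which extender it loads and how far it reduces."""
    extender: str   # e.g. "malonyl", "methylmalonyl"
    reduction: str  # "keto", "hydroxy", "enoyl" or "methylene"


def iterative_program(sites: SiteSet, rounds: int) -> list[SiteSet]:
    """Iterative synthase: the same sites act in every round, so the number
    of rounds (chain length) has to be supplied as separate information."""
    return [sites] * rounds


def modular_program(modules: list[SiteSet]) -> list[SiteSet]:
    """Modular synthase: one module per round, each acting once, so chain
    length is simply the number of modules."""
    return list(modules)


# A saturated C16 fatty acid needs seven malonyl-derived extension rounds.
fas_like = iterative_program(SiteSet("malonyl", "methylene"), rounds=7)

# Hypothetical three-module line, invented for illustration only.
toy_modular = modular_program([
    SiteSet("methylmalonyl", "hydroxy"),
    SiteSet("malonyl", "keto"),
    SiteSet("methylmalonyl", "methylene"),
])

print(len(fas_like), len(set(fas_like)))        # many rounds, one set of sites
print(len(toy_modular), len(set(toy_modular)))  # one distinct set of sites per round
```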

What was the selective pressure for the modular
PKS to evolve? Two biochemical features currently
correlate with the iterative vs modular construction 
of known FASs and PKSs. One is the use of exclu- 
sively malonyl CoA extender units, vs the use of at 
least a proportion of more complex chain extenders. 
The iterative synthases, including both the type I and
the type II FASs, and the type I and type II synthases 
for the aromatic class of PKSs, use malonyl CoA for 
every round of chain extension (exceptions include 
FASs that introduce methyl branches, 1 but their 
pattern is quite repetitive and may not present a 
programming challenge), whereas the modular PKSs 
use either all methylmalonyl extenders (which yield 
two alternative stereochemical configurations for the 
incorporated propionate residues) as in the case of
the erythromycin PKS (section III.D), or a precise
sequence of malonyl, methylmalonyl, or more complex
extenders (for example an ethylmalonyl unit to
yield a butyrate residue in tylosin (38) and spiramycin
(39), or glycerol-derived extenders in soraphen
(45): Figure 13). Perhaps the choices necessitated 
by this variety of extender units were too hard to 
program into the iterative synthase, which is busy 
dealing with the issue of chain length that follows 
automatically from the number of modules in the 
modular PKSs. The second feature of the modular 
PKSs is the complexity of the programming of the 
reductive cycle that could generate five functionalities
(keto, R hydroxy, S hydroxy, enoyl, or methylene)
at each round of chain building, giving in
principle 5^n possibilities, where n is the number of
rounds (e.g., 15 625 for erythromycin, of which just
one is chosen!). In contrast, aromatic PKSs usually
make very simple reductive changes to the β-keto
groups of the growing chain: e.g., none for resorcinol
or tetracenomycin, or one reduction/dehydration for 
6-methylsalicylic acid or actinorhodin. (And for 
examples such as actinorhodin there are even grounds 
to believe that the keto group to be reduced is 
actually recognized after chain assembly is complete, 
rather than reflecting a choice during chain building. 73)
Carrying out the complexity of reductive
programming needed to build a macrolide aglycon 
may well be beyond the capacity of an iterative 
synthase. 
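
The figure of 15 625 quoted above is simply 5^6, five possible functionalities at each of six rounds of chain building. A short sketch makes the arithmetic explicit; the function names are illustrative, and the enumeration treats every round as independent, which is the "in principle" counting used in the text rather than a claim about what any enzyme can actually do.

```python
from itertools import product

FUNCTIONALITIES = ("keto", "R-hydroxy", "S-hydroxy", "enoyl", "methylene")


def count_reductive_programs(rounds: int) -> int:
    """Number of possible reductive outcomes over `rounds` rounds: 5 ** rounds."""
    return len(FUNCTIONALITIES) ** rounds


def enumerate_reductive_programs(rounds: int):
    """Lazily yield every possible sequence of functionalities."""
    return product(FUNCTIONALITIES, repeat=rounds)


# Six rounds of chain building, as for the erythromycin polyketide.
print(count_reductive_programs(6))
print(sum(1 for _ in enumerate_reductive_programs(6)))
```

Both lines print 15625, of which a given synthase realizes exactly one sequence.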

Hopefully, a fuller understanding of the precise 
mechanisms of programming choices will eventually 
allow a distinction to be made between complexity 
of chain extender choices and of reductive cycle 
choices as a driving force in the evolution of the 
modular PKS. Meanwhile, what are needed are 
examples of PKSs that are required to make one or 
the other, but not both, kinds of choices. The 
polyketide chain of mevinolin (59: Figure 15), which 
is all acetate-derived and is made by an iterative 
synthase, contains double bonds and hydroxyl groups
(discussed in ref 21), suggesting the need to program 
a significant number of reductive choices. Again, at 
least a part of the biosynthesis of coronafacic acid, 
the polyketide component of the Pseudomonas phy- 
totoxin coronatine, is catalyzed by a type II (presum- 
ably iterative) PKS, but it is not at all certain that 
the product of this PKS includes the moiety of 
coronafacic acid that may need complex program- 
ming. 206 A further interesting case concerns pyolu- 
teorin biosynthesis by Pseudomonas fluorescens. 206 * 
This is an aromatic polyketide assembled from all 
malonate CoA extender units, yet the PKS is modu- 
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lar. Perhaps this reflects the need for a ketoreduc- 
tion associated with the third of these condensations. 

C. Developmental Crosstalk 

On the hypothesis of gene duplication and diver- 
gence in FAS and PKS evolution, many present day 
organisms will have come to contain two or more 
FASs and/or PKSs encoded by genes or gene clusters 
that have a common origin. These phylogenetically 
related synthases can be expected to retain varying 
degrees of functional and structural similarity. Do 
they also retain the potential for biochemical crosstalk, 
and if so are there situations in which this is 
exploited, perhaps as part of a developmental pro- 
gression, or to maximize the metabolic diversity 
available from the appropriate expression of subsets 
of the structural genes? As the broader biology of 
fatty acid and polyketide synthesis is increasingly 
investigated, convincing evidence of productive crosstalk
is indeed emerging. A few examples will serve
as illustrations of what is becoming a fascinating field 
of study. 

As described in section II, Streptomyces coelicolor
contains two separate clusters of structurally similar
genes that encode the type II PKSs for two different
aromatic polyketides: the octaketide antibiotic actinorhodin
and a (presumed) dodecaketide spore pigment.
That at least some of the subunits of the two
PKSs are (still) functionally compatible was shown
by the artificial complementation of deletion or point
mutations of the genes encoding any of the three
minimal PKS subunits of either PKS by the corresponding
subunits from the other. 78,109 Such crosstalk
does not, however, occur naturally: mutations in the
genes for one PKS are not compensated for by the
unmutated genes for the other when the gene sets
are expressed normally. Presumably this is because
the normal expression pattern ensures that the
actinorhodin PKS is made only in vegetative parts
of the colony, while genes for the spore pigment PKS
are expressed only in the developing spores. Probably
there has been no selective pressure for the two
PKSs to diverge far enough to avoid unwanted
crosstalk because the opportunity for this does not
arise. A different situation applies to the subunits
of the type II FAS of S. coelicolor. Artificial coexpression
of the FAS ACP (the other subunits have
not been tested) with components of the actinorhodin
PKS resulted in only a minute level of complementation:
in this case, biochemical divergence has
probably ensured the absence of unwanted crosstalk,
which could have occurred because fatty acid biosynthesis
may well still be needed in those parts of the
colony that have started to make actinorhodin. 207
Interestingly, however, the gene for an essential 
component of any type II synthase, a malonyl trans- 
ferase to transfer malonyl units from CoA to the 
prosthetic group of the ACP for chain extension, is 
found in the FAS gene cluster but in neither of the 
PKS clusters. In this case, conserved biochemical 
crosstalk may be essential to provide the two PKSs 
with a functional malonyl transferase, 208 and it is 
possible that fatty acid and polyketide synthases 
could even share the first condensation reaction. 207 
Another example concerns the fatty acid side 
chains of the lipooligosaccharide signaling molecules

Figure 17. Structures of lipooligosaccharide nodulation
factors produced by Rhizobium leguminosarum strains. In
the absence of the nodE-encoded KS and nodF-encoded
ACP, the acyl side chain is 18:1, characteristic of
cis-vaccenic acid (62). When nodE and nodF are present, the
18:1 acyl side chain is replaced, in a proportion of the
molecules, by the 18:4 unsaturated 63 to produce the highly
bioactive Nod factor. (Modified from ref 209. Copyright 1991
Macmillan Magazines Ltd.)

produced by rhizobia. These molecules initiate the 
plant response that results in engulfment of the 
bacteria by the root hairs of leguminous plants and 
eventually leads to a nitrogen-fixing symbiosis within 
a specialized plant organ, the root nodule. The
specificity of each signaling molecule for particular
species of legume often depends on the length and
pattern of unsaturation of the fatty acid side chains
(e.g., 62 and 63: Figure 17). 209 These appear to be
made by a controlled kind of crosstalk between the
type II FAS of the bacteria, which would synthesize
a generic, largely saturated carbon chain up to at
least C10, and specific KS and ACP subunits encoded
by nodulation genes, nodE and nodF (and in the case
of Rhizobium meliloti also a KR encoded by nodG),
which extend the chain in a specific manner, al- 
though presumably with the cooperation of other 
essential FAS subunits that are not provided de novo 
as nod gene products; and indeed there is an appar- 
ent competition between the primary FAS and the 
"nodulation FAS", so that a mixture of end products 
(62 and 63) is seen. 209 

There is a growing list of other situations in which 
fatty acid moieties form an integral part of specialized 
metabolites. Sometimes the evidence points to a gene 
duplication that has given rise to a complete special 
FAS for this role, distinct from the one involved in 
primary metabolism; a good example is the origin of 
the hexanoyl starter unit for norsolorinic acid biosynthesis
in Aspergillus (see section IV). 185 Other
situations may resemble the Rhizobium case, where 
the FAS of primary metabolism plays a second role 
by providing partially extended fatty acid chains that
are used for secondary metabolite biosynthesis,
probably via the agency of the ACPs. An example is
provided by the acyl side chains of the N-acyl
homoserine lactones involved in quorum sensing in
various Gram-negative bacteria, where an acyl ACP
appears to donate the acyl chain. Similarly, the fatty
acid side chains of lipopeptide antibiotics such as the
A21978C complex of Streptomyces roseosporus may well
originate in this way, because, very suggestively, the
three major components of the complex carry branched
side chains (up to C13; Figure 18) that correspond to
truncated versions of the most abundant fatty acids
of streptomycetes.

Figure 18. Structures of three members of the lipopeptide
A21978C complex of Streptomyces roseosporus. The fatty
... butyryl CoA and 65 with isobutyryl CoA or 3-methylbutyryl
CoA, just as in the case of some of the major branched ...


VII. Concluding Remarks

This article has been wide-ranging and has, I hope,
given a flavor of the multidisciplinarity of current
studies of polyketide synthesis. As a geneticist, I
have emphasized (some might say overemphasized!)
the contributions of genetics in its various forms
(mutagenesis, sequence analysis, natural and engineered
recombination, gene expression, and phylogenetic
comparisons) to the development of knowledge
in the field. Other contributors to this special
issue of Chemical Reviews will, I am sure, focus
more on the chemistry and biochemistry of polyketide
biosynthesis, and on the crucial part played by these
disciplines in developing our understanding. It seems
obvious that, in order to gain insight into the precise
workings of these multifunctional
biosynthetic machines, in vitro studies will become
preeminent over the next few years, although of
course they will continue to be facilitated by protein
engineering made possible by site-directed mutagenesis
and the over-expression of genes. Genetic engineering
will also be crucial in generating further
recombinants for chemical analysis, while in
vivo studies will always be needed to discriminate
between what is biochemically possible and
what actually occurs in nature. And finally, in the
context of the biology of polyketide and fatty acid
synthases, genetics will surely continue to come up
with novelties that will in turn stimulate further
rounds of in vitro analysis.
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